

1 PRELIMINARIES


Sense perception is too variegated and complex to be a good starting point 
for the philosophy of science. Much of it is but barely cognitive. We are 
continually seeing whatever lies at the edge of our visual field, hearing the 
steady background noise of the city, touching our shirts with our shoulders, 
yet hardly ever have we derived any knowledge from such careless, confused 
perceptions. Nor shall we normally achieve any scientific knowledge through
the diligent, discriminative perceptions that go into the contemplation of a
work of art. I believe, therefore, that instead of trying to cope with 
perception in all its rich variety, the philosopher of science will do well to 
concentrate on the one form of it that is directly relevant to his subject, 
namely, the attentive, deliberate, explicitly cognitive mode of perception 
that goes under the name of observation. 

To avoid needless exceptions to what I shall be saying, I restrict the 
meaning of the word to the observation of physical objects—in the widest 
sense, i.e. things, states, processes, events, and their properties and
relations. This includes, e.g. the observation of where my back aches or how 
my heart throbs, but does not include the ‘introspection’ of disembodied 
mental states and processes, if there is such a thing. I shall also exclude from 
the scope of observation the access to physical objects by so-called extra- 
sensory perception (ESP). 

I take for granted that observation always involves a physical process 
which links in a causal chain the object or objects observed to a physical 
system, that I shall call the receiver, in which the effects of the said process
are recorded. Observation can be a source of knowledge only if it also 
involves awareness. Such awareness must be, of course, a state of a person, 
the observer, who is thereby enabled to learn from observation. 

For want of a better word, I shall refer to observations with and without 
awareness as personal and impersonal observations, respectively. In personal 
observation the receiver is always the observer’s body—or a part of it—but 
in impersonal observation it can be a wide variety of things: a photographic 
camera, a voltmeter, a telephone-bugging device, even a human being (e.g. 
the tasters employed in ancient courts to test foods for poison). For the 
result of an impersonal observation to become known, an observer must 
indeed eventually observe the receiver with his own senses. 

In personal observation the observer always pays attention to something 
that is there before his eyes, or within reach of his ears, or at the tip of his 
fingers, etc. I call it the direct object of observation and say that it is being 
directly observed. The observer’s main concern, however, may be some- 
thing else, typically the object of an impersonal observation whose receiver 
he observes personally. I call this the indirect object of the personal 
observation, and say that it is indirectly observed by the observer. The 
distinction between direct and indirect observation is crucial to philoso- 
phers who wish to play down the importance of the intellectual factor in 
experience, or to deny it altogether. For it is plain that only direct
observation might conceivably glean information from sense awareness alone, not
supplemented by thought. In indirect observation the observer must rely on 
his previous knowledge—usually well padded with hypotheses—of the 
relation between the object directly accessible to him and the one he seeks to 
reach through it, in order to gain cognition of the latter through awareness of 
the former. But it is not easy to say just at what point observation becomes 
indirect, and hence dependent on memory and reason. It is clear that 
someone who alternately sniffs the back of his hands to choose between
two scents that have been sprayed on them practices direct observation, 
while someone who peruses a collection of cross-section vs. energy graphs 
looking for evidence of a new particle is involved in indirect observation. 
However, there are borderline cases in which the distinction between direct 
and indirect observation becomes fuzzy and can only be enforced by 
enjoining a strained reading of the definitions. Thus, I can test the hardness 
of a surface by pressing against it with the tip of my fingers, but, if I am not 
near enough I can also test it with the tip of a walking-stick I hold firmly in 
my hand. Shall we say that the latter is an instance of indirect observation 
because a dead stick is part of the receiver? But the stick plays a role in 
observation only in so far as it forms a single rigid system with my forearm; 
and the hardness of the surface is recorded in my joints, just as when I test it 
with my fingers. On the strength of a similar analogy, I shall say that I can 
read a definition of the OED no less straightforwardly with a powerful 
magnifying glass in the compact edition than with my ordinary eyeglasses in 
the standard one—whereas reading with my naked eyes is a form of direct
observation I have long ago learnt to avoid as being both tiresome and 
deceptive. Grover Maxwell [1962] built his claim that microscopic objects 
are directly observable on the gradual and seemingly continuous transition 
from ‘looking through a window pane’ to ‘looking through a high power 
microscope’. However, as Ian Hacking ([1983], pp. 194 ff.) aptly reminds
us, the analogy must break down somewhere on the way, because vision 
through a microscope does not repose on quite the same optical processes as 
ordinary vision: while in the latter information is relayed from the object to 
the retina mainly by the reflection, transmission and refraction of light, 
looking through a high power microscope essentially involves light diffrac- 
tion and interference. Be that as it may, if thoughtless observation either 
does not exist or lacks epistemic value, the distinction between direct and 
indirect observation does not have much epistemological significance. (That 
the distinction is of virtually no importance in the actual practice of science 
is shown by the fact, rightly stressed by Shapere [1982], that physicists 
normally call ‘direct observation’ much that, according to the above 
stipulations, ought to be called ‘indirect’.) 


2 SOME FEATURES OF PERSONAL OBSERVATION 


A personal observation involves a peculiar mode of awareness, distinct, say, 
from fear or recollection, and from other forms of perception, such as 
watching a movie, or basking in the sun on a beach. As words are used in 
English, a personal observation must also involve a physical process of some 
kind or other, by which the observed object acts on the observer’s body. If 
this physical process of observation is lacking, a person who is aware of 
observing is said to hallucinate. Indeed one will disqualify as hallucinatory 
any case of purportedly observational awareness in which an object acts on 
the observer’s body in a manner that would not normally bring about that 
awareness, e.g. if the observer is visually aware of a statue that he is touching 
in the dark. 

We shall examine later the general characteristics of the physical process 
constitutive of observation, both personal and impersonal. But first let us 
take up personal observation alone, as it is disclosed in that mode of 
awareness which is proper and peculiar to it. Like other deliberate, attentive 
forms of consciousness, observational awareness involves self-awareness. 
The observer must somehow be aware of observing, or else he cannot 
properly be said to observe. (I say ‘somehow’ to make allowance for 
situations in which the observer is so absorbed by his task that he becomes 
oblivious of himself; even in such cases, if he is really observing, he will 
become aware of it as soon as he stops a moment to reflect on what he is 
doing.) Being immediate and not liable to control by others, self-awareness 
has often been pronounced incorrigible. This cannot imply, however, that 
every statement inspired by self-awareness exactly says what the speaker is 
aware of, and neither requires nor admits improvement. Thus, when I now
proceed to describe the structure of observational awareness, although I do 
not see how one could question my evidence for what I shall say—namely, 
my own self-awareness of what I call my observations—I do not doubt that 
one would find better words to convey that evidence. 

I shall list some structural features of personal observation which, in my 
view, deserve special attention. Then, in the next four sections, I shall seek 
to elucidate them. 


(1) Every personal observation is an episode in the observer’s life. As 
such it takes time, it involves recollections and expectations, it is 
guided by the observer’s memory and habits, and it is meant to serve 
his purposes. 

(2) In every observation, the observer’s attention falls on something he is 
currently aware of, which we have agreed to call ‘the direct object’ of 
the observation. 

(3) The observer grasps the object—be it direct or indirect—as a 
particular instance of some universal. 

(4) The direct object of observation is never simple. Its parts and aspects 
stand to one another and to other directly observable objects in 
relations of time and place. 


3 PERSONAL OBSERVATIONS ARE EPISODES IN THE 
OBSERVER’S LIFE 


If the analysis of this truism under (1) is granted, there can be no such thing 
as an instantaneous, simple, isolated, presuppositionless or purposeless 
personal observation. This disjunction sets definite limits to the claims of 
empiricism. Knowledge cannot rest on observation alone if all observation is 
conditioned by the observer’s interests, preferences, beliefs and former 
knowledge. 

Note that I do not assert that all perceptions are thus conditioned. It might 
be plausibly argued that the earliest perceptions of babies are assumption- 
less and value-free, if only because they are the earliest, and are still too 
vague and unexpected to fall into any kind of order. Against such arguments 
we need not resort to our different but probably no less dubious insights into 
the minds of babies. It is enough to point out that such earliest perceptions 
are not observations, and that, even though they may help to build habits 
which will later support the cognitive use of the senses, they can hardly be 
said to yield knowledge. 

On the other hand, it seems to me fairly obvious that no mode of human 
awareness can lack duration or complexity, or can exist in isolation 
(although this too has been questioned, as Peirce [1931–58], Section 7.629,
recalls). This holds therefore for all perception. However, Hume asserted 
that our senses ‘convey to us nothing but a single perception, and never give 
us the least intimation of anything beyond’ (Hume [1739], p. 189). I do not
find that Hume’s claim does justice to my observational awareness. 
Anything that I am capable of distinguishing as a ‘single perception’ —be it 
the furry surface of the carpet under my feet, or the colourful backs of the 
books on the shelf in front of me, or the shrill sound of the car alarm beaming 
from the parking lot beneath my window—points as such to a multitude of 
things beyond it. It may be argued that it does not do so of itself, but only 
because it is woven into a complex of experiences and expectations in the 
midst of which it turns up as a perception. But if I shut myself from, or 
ignore, or ‘forget’ that complex, there would be nothing left for me to
pinpoint as a ‘single perception’. Thus, except for the marginal cases we 
shall consider near the end of Section 4, whatever my senses succeed in 
conveying to my attention is self-transcending—it is a signpost cluttered 
with intimations of the world. 


4 THE DIRECT OBJECT OF OBSERVATION 


Though nobody doubts that any observation has a direct object, there has 
been much controversy about its nature. Let me introduce the issue through 
an example. I own two pairs of eyeglasses which enable me to focus my eyes 
at different distances. Suppose now that I stand in Madrid’s Plaza de 
Oriente and want to decide which of my eyeglasses will be most suitable for 
examining the statue of King Philip IV. I wear them alternately and try to
ascertain which of them gives a clearer view of the embroidery at the tip of 
the King’s sash. The question in dispute can then be put as follows: When I 
change glasses, do I change the direct object of observation, or only the 
aspect or guise under which I see it? If we are to abide by ordinary English 
usage, there is no doubt as to the right answer to this question: My attention 
has been directed all the time to the same bronze sash, hanging motionless 
from the statue above me, and it looks slightly different, now neater, now 
woolier, because I see it through different glasses. However, some 
philosophers have wished to reform ordinary language at this point, at least 
for the purposes of philosophical discourse, and would have us say, in the 
case of my example, that the direct object of observation, i.e. the object I am
currently aware of and paying attention to on each occasion, is the multiply 
connected black spot at the centre of my visual field that I take to be an aspect 
or Abschattung of the sash, and that, since the spot is sharper when I wear
one pair of glasses, foggier when I wear the other pair, it is evident that the
direct object of observation is, on each occasion, a different one. As to the
statue, such philosophers would say that—regardless of what, if anything, it 
may be in itself—as an object of human knowledge it is constructed from the 
aforesaid and many other likewise directly observed objects and also 
perhaps from non-observational ingredients. 

We need a collective name for the entities to which direct personal 
observation is confined by our language reformers. ‘Phenomenon’ is a
time-honoured name for them, which instantly yields ‘phenomenalism’ as a
denomination for their sponsors. But ‘phenomenon’ is currently used for 
physical occurrences which a person can only observe indirectly, by means 
of thoughtfully contrived devices (cf. Hacking [1983], pp. 220 ff.). I shall 
therefore call them sense appearances, although this name also suggests 
some wrong connotations. 

To my mind, the main motive for restricting the immediate scope of 
personal observations to sense appearances is that one thereby secures for its 
direct objects a purity from intellectual contaminants that cannot be 
matched by the material bodies which, in plain English, we are said to 
observe directly. I shall describe some imaginary—yet practically 
unimaginable—situations to which one would willingly apply the new 
language or in which one might dismiss the entire issue as a purely verbal 
one. The difference between these situations and our own condition will 
make clear, I hope, why the proposed reform is not viable, why we cannot 
persist in speaking, for any significant length of time, as if our senses made 
us aware of sense appearances only. 

Consider a purely contemplative observer, who sees static scenes, one 
after the other. He would have little or no inducement to analyse the scenes 
into parts or to associate parts of different scenes, unless such parts 
happened to be equal. In the latter case, there would be no reason for 
distinguishing the object of observation from its momentarily perceived 
aspects. Suppose now that the scenes observed change gradually and flow 
into each other, as in a motion picture. The observer could then perhaps 
discern patterns in the flow and come to view parts of successive scenes as 
diverse aspects of the same object. Such an object, however, would be no 
more than the series of its presentations, or rather the law of that series. A 
Humean analysis—which the circumstances obviously would call for— 
would unmask such laws, exposing them as mere habits. We can add sound 
and even smells to the motion picture without changing the situation 
essentially. We humans differ, of course, from a purely contemplative 
observer in that we have an interest, often a vital one, in the objects we 
perceive, and are sometimes also able to change them. But even if we let our 
fictitious observer resemble us in this, if we allow, say, some of the movie 
sequences which are all his experience to become pleasant or painful, and we 
let him will and occasionally achieve the removal of pain, the renewal of 
pleasure, he still would not be one of us. For he lacks the complex array of 
muscular, postural, thermal, tactile experiences in which we perceive 
ourselves as bodies, incessantly interacting with other bodies, dangerously 
exposed to them, and also, through that very interaction, capable of 
manipulating them and observing them. The pencil I hold in my hand and 
press between my fingers, the chair I sit in, the table I write on, are grasped 
in observation as true bodies because through the pressure I exert on them, 
the movements I make against them, the thermal gradients they generate on my
skin, I sense their bodily presence on a par with my own. Dr Johnson refuted 
Berkeley by kicking a stone. The Greeks fought Pyrrho’s sceptical doubts by
letting a dog loose on him. Professionals smile with condescension at such 
wordless arguments, but there is a wisdom in them. Macbeth would clutch 
the dagger which he saw before him, or else dismiss it as ‘a dagger of the 
mind’. 

Awareness of our interaction with the bodies surrounding us is the key to 
our construal of personal observation as a physical process, with our body as 
the receiver. It is convenient to recall how this construal introduces a 
measure of order and consistency into the diverse and often baffling 
appearance of things. If the observer becomes aware of the physical objects 
about him by their action on his body, his observational awareness must 
depend not only on the objects themselves, but also on the condition of his 
body and all other circumstances influencing the observation process. Thus, 
our grasp of the physical basis of vision enables us to understand why a 
Gothic steeple should appear so different through the fog or under a blazing 
sun, why a perfectly straight pencil should show a kink when partially 
submerged in a glass of water, why the police van catching up with my car
should turn up in the mirror in front of me, why a supernova should flare up 
in the sky now in the direction where it faded out forever several million 
years ago. On similar grounds we can account also for our seeing visibles 
(and hearing audibles, etc.) which are not judged to be an aspect of anything, 
such as the red, semitransparent discs that we see wherever we direct our eyes
after we have been looking intently for a while at a strong source of light, or 
the colourless little worms that we see wriggling about in the air if we stare at 
a bright cloudy sky. Since visual (acoustic, etc.) awareness closely depends 
on the state of our body it is to be expected that it will often be stirred by 
changes in that state which are not a part of any process of observation (just 
as, say, a short-circuited loudspeaker will emit a noise which is not a part of 
the music being played). It is fortunate, indeed, that such occurrences, 
though frequent, rarely become obtrusive. But it is a perversion of 
philosophy to choose such marginal events as the prototype of all our sense 
experience, and then to wonder how it may come to pass that by far the 
greater part of it is so neatly ordered as a display of physical objects. In fact, 
outside this order in which we normally perceive things in their manifold 
aspects, it is hard to conceive that there could even exist an awareness of
objectless sensibilia.


5 CONCEPTUAL GRASP OF THE OBSERVED OBJECT


Throughout the preceding discussion I have implicitly appealed to the 
principle stated in Section 2 under (3): The observer grasps the object as a
particular instance of a universal. By this I do not mean to deny that we do in 
certain circumstances perceive individuals as such, and not as members of a 
class. When talking to a close friend, or laughing together, or holding her or 
him in our arms, we are often aware only of the unique individual we are 
with, and not of any universals that she or he instantiates. Awareness of
individuality in its irreplaceable oneness is, I surmise, a necessary condition 
of any genuine personal relationship. I contend, however, that such 
awareness is not observational, and that as soon as one begins to observe 
one’s partner, her teeth, her accent, her syntax, one does in effect subsume 
her, or at any rate that aspect of her which one is observing, under a general 
concept. This feature of observation which, for brevity, we may call the 
principle of conceptual grasp, holds for all objects of observation, both 
direct and indirect. It is indeed especially obvious in the case of impersonal 
observation by means of artificial contrivances. Such artifacts cannot 
properly yield data unless the observer conceives of them as physical 
systems of a certain kind, which interact according to certain laws with 
objects of the class under investigation. It is therefore well nigh impossible 
to figure out how observation can be a source of science, unless it obeys the 
principle of conceptual grasp, and the observer qua observer approaches the
particular objects of observation with a view to their generality. 

If concepts go into every observation, then empirical knowledge is 
intellectual through and through. This was already implicit in Kant’s 
dictum that sense awareness without concepts is blind. It raises several 
philosophical problems, on two of which I shall briefly touch now. 

The first is the problem of noogony, or the origin of concepts. These 
cannot all be obtained from observation by the classical procedures of 
comparison, reflection and abstraction. In each observation some concepts 
must be at work from the very outset. Each time we revise a judgment some 
concepts must remain stable. Every exercise of our understanding thus 
involves concepts which are, so to speak, locally a priori. We need not hold 
that any concepts are permanent, or global in scope; but it is clear that the 
doctrine that all concepts proceed from observation would entangle us in an 
infinite regress. 

The second problem I wish to mention can be stated as follows: Every 
conceptual grasp of an object of observation is liable to revision and 
correction in the light of other observations. What justifies our preference 
for some observations above others? How can we judge that we have 
achieved a better conceptual grasp? This problem was miraculously 
solved—or rather dissolved—by the positivist dogma of Immaculate
Perception, according to which we can observe virgin data, unpolluted by 
our fallacious intelligence, and adjust our concepts and judgments to them. 
If we could ever make a perfectly self-contained observation, not signifying 
anything beyond itself, there would certainly be no means—and no 
motive—for revising it. One would just blow away the conceptual chaff and 
leave the observational grain alone in its Parmenidean immutability—and 
irrelevance. But, as a matter of fact, none of our observations can be thus 
isolated. Each of them is constitutively linked by concepts to other 
observations welded into a complex network of assumptions and beliefs, 
together with which it gives rise to a wealth of expectations. Failure of
expectations is ultimately perhaps the only inducement for revising and
correcting our observations. But the revision of an observation is not 
effected only in the light of posterior observations of the unexpected. It is 
usually assisted also by the record of other past observations which may be 
more detailed, more careful, or more consonant with each other. 
Consonance and detail yield unquestionable, more or less unambiguous 
criteria of preference. But to say that an observation is more careful than 
another one would seem to presuppose the choice that we seek to justify. 
However, some observational procedures may well be deemed more careful 
than others if they normally lead to more successful expectations. Moreover, 
our present, well-corroborated understanding of the physical processes of 
observation provides definite and, within that understanding, unim- 
peachable grounds for judging the reliability of observations. How one goes 
about using such diverse criteria in the progress of experience is well known 
to us from our daily lives. More sophisticated examples are provided by the 
history and the current practice of scientific research. By recalling these 
generalities about the revision and mutual illumination of observations I am 
not trying to usurp for philosophy the role of the scientific methodologist— 
who sifts and systematises the procedures for collating observation data and 
the criteria for judging their worth—but rather to clear the way for him. 
What we ought to bear in mind is that, if all our knowledge of physical 
objects is corrigible, it must be self-correcting, for there is no outside 
authority to which one could turn for help. I find therefore that, important 
as it is, Quine’s recognition that ‘our statements about the external world 
face the tribunal of sense experience not individually, but only as a corporate 
body’ ([1961], p. 41) is not sufficiently thorough. For in the trial of empirical 
knowledge the defendants are at once the prosecution, the witnesses and the 
jury, who must find the guilty among themselves with no more light than
they can all jointly put together. 


6 OBSERVED RELATIONS OF TIME AND PLACE 


Relations of time and place are among the most pervasive universals under 
which physical objects are grasped in observation. There is no question that, 
if we directly observe physical objects, such relations between them are also 
observed directly. As I stand in front of a class I see rows of students sitting at 
their desks between me and the back wall. I see a student raise both his hands, 
one after the other, and then simultaneously put them down. This is the 
proper way of describing in ordinary English what I see, and I can discern no 
reason to reform it. Indeed, even those who maintain that physical objects 
are not observed directly, but are constructed from sense appearances, 
would normally grant that the latter stand in directly observable time and 
place relations of their own. I shall discuss some such views in the second 
half of this Section. But first let us proceed along the common sense line we 
have been following. 

In any observation I grasp the direct object of my attention amidst other
physical objects—always including my own body—to which it stands in 
more or less definite relations of place and time. It is a familiar fact of life that 
the place and time relations involving diverse objects personally observed by 
different observers on different occasions mesh together into a single 
relational net on which each observer has, so to speak, a permanent grip. 
Consonance with this network of time and place relations is one of our 
preferred criteria for judging our grasp of particular observations. It is 
chiefly through the steady presence of this network that the world of possible
objects of personal observation is held as a permanent background to 
whatever we happen to be actually observing. Thus, when I stand in the 
classroom where I usually teach, looking at the door, I expect to find behind 
it the corridor leading to the stairs that go down to the street where my car is 
parked. Of course, it could turn out that while I spoke in the classroom the 
rest of the city had been wiped out by a surgical nuclear strike; but I would 
expect to find behind the door, even in that case, the place where the corridor 
had been. Annihilation of that place itself, though not a logical impossibility, 
cannot be contemplated without a drastic change in the very concepts with 
which we grasp our surroundings. 

Like other relational systems, the network of time and place relations 
between directly observable physical objects may be conceived as an 
abstract structure by ignoring the irrelevant peculiarities of the entities that 
actually hold the relations. This can be done as follows: Let us say that two 
directly observable physical objects are P-equivalent (P for place) if they can 
exchange places while all relations of place between directly observable 
physical objects remain otherwise unchanged. (In other words, two such 
objects, A and B, are P-equivalent if and only if the truth-value of every
proposition concerning relations of place between directly observable 
physical objects remains unchanged when A and B are annihilated and 
recreated at each other’s place and the references of all names of A and B are 
mutually exchanged.) P-equivalence partitions the world of directly ob- 
servable physical objects into P-equivalence classes. We regard all members 
of a given P-equivalence class as copies of a prototype and forget their 
individual differences. The list of P-equivalence classes and the relations of 
place between members of such classes is then a structure, which we may call 
the space of direct observability. The structure thus conceived would be quite 
unwieldy, but it could be greatly simplified if one succeeded in defining all 
P-equivalence classes in terms of a few. In a similar way, we might 
characterise T-equivalence (T for time) and PT-equivalence (PT for place- 
and-time) between directly observable physical objects in terms of their 
mutual replaceability with respect to time relations or with respect to place 
and time relations, respectively. These equivalences would then generate 
two structures which we may call the time and the spacetime of direct 
observability. 
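
The definition of P-equivalence can be tried out on a toy world. The following is only a sketch under assumptions that are not in the text: bodies are intervals on a line, given by a centre and a half-width, and the only place relation is the signed gap between the facing edges of two bodies. The names Body, place_facts, swap_places, p_equivalent and p_classes are hypothetical.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Body:
    centre: int      # assumed toy model: position of the body on a line
    half_width: int  # assumed toy model: extent to either side of the centre


def place_facts(world):
    """Every place proposition of the toy world: for each ordered pair of
    names (x, y), the signed gap from the right edge of x to the left edge of y."""
    facts = {}
    for x, y in permutations(world, 2):
        bx, by = world[x], world[y]
        facts[(x, y)] = (by.centre - by.half_width) - (bx.centre + bx.half_width)
    return facts


def swap_places(world, a, b):
    """Recreate a and b at each other's place and exchange their names.
    Afterwards the name a refers to the former b, sitting where a used to be."""
    swapped = dict(world)
    swapped[a] = Body(world[a].centre, world[b].half_width)
    swapped[b] = Body(world[b].centre, world[a].half_width)
    return swapped


def p_equivalent(world, a, b):
    """a and b are P-equivalent if no place proposition changes its value."""
    return place_facts(world) == place_facts(swap_places(world, a, b))


def p_classes(world):
    """Group names into P-equivalence classes by comparing each name
    with one representative of every class found so far."""
    classes = []
    for name in world:
        for cls in classes:
            if p_equivalent(world, name, cls[0]):
                cls.append(name)
                break
        else:
            classes.append([name])
    return classes


world = {
    "pencil": Body(0, 1),
    "pen": Body(10, 1),
    "table": Body(30, 8),
    "chair": Body(50, 3),
}
print(p_classes(world))
```

In this toy world the pencil and the pen fall into one class, because exchanging them leaves every gap as it was, while the table and the chair each stand alone. With so coarse a vocabulary of relations the classes simply track equal extent; a richer set of place relations would have to be checked separately for whether pairwise exchangeability is still transitive.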

Assuming that the said structures are conceivable, one may ask what 
relation exists between the time, the space and the spacetime of direct
observability and the familiar time, space and spacetime manifolds in which 
mathematical physics deploys its objects. To give an inkling of what is at 
stake with this question, I shall sketch three conceivable answers. For 
brevity’s sake I shall speak only of the relation between the spacetime of 
direct observability (STDO) and the spacetime of mathematical physics
(STMP); the other two pairs of structures can be treated similarly. The 
most straightforward answer is that STDO is an open submanifold of 
STMP. It entails that every directly observable event has a neighbourhood 
in STDO which is homeomorphic to R⁴—a rather strong requirement,
implying that any observable time lapse is not only infinitely divisible, but 
also continuous in the strict mathematical sense. Another, less stringent 
answer is that STDO can be identified in a reasonable, non-trivial way to a 
structure obtained by choosing some suitable subset of STMP and 
restricting to it the constitutive relations of STMP. In this case, STDO 
could be said to be embedded in STMP in the same sense in which a finite 
geometry consisting, say, of a few dozen coplanar points and lines, plus their 
relations of incidence, concurrence and collinearity, is embedded in the 
Euclidean plane. STMP would then be essentially richer than STDO but in 
every way compatible with it, so that we could say that STMP is an 
extension of STDO. A third answer seems to me more plausible than the 
former two, namely, that STMP is an extension not of STDO, but of an 
idealisation of STDO, i.e. of a structure defined by ignoring some 
constitutive features of STDO (e.g. the direction of time), and resolving 
by fiat the fuzziness of others (e.g. metric relations between observed 
objects). 

If one believes that only sense appearances are directly observed and that 
physical objects are constructed from them by the human mind, it is natural 
to assume that the place and time relations of the different types of sense 
appearances constitute distinct structures (viz. a visual space, a tactile
space, etc.) and to inquire into their peculiarities. It is indeed surprising
that such inquiries should hitherto have been confined to place relations 
only, and that we have yet to hear about separate perceptual times, or 
spacetimes. It has been claimed repeatedly, on the strength of both armchair 
philosophy (Reid [1764]) and experimental psychology (Luneburg [1947], 
Blank [1953], [1958a,b], Battro et al. [1976]), that visual space, i.e. the 
abstract structure of place relations between visual percepts, has a definite 
geometry, incompatible with the geometry of physical space developed by 
common sense and perfected by mathematical physics, seemingly on the 
basis of haptic (tactile and kinaesthetic) perceptions alone. Such a radical
distinction between visual and physical space provides a very strong 
argument for the thesis that physical objects are not seen but inferred. It can 
also yield plausible solutions to some classical puzzles regarding visual 
illusions and multiple or delayed images. If the places I see or hear 
correspond to but are not identical with the places among which I stand and 
move, I should normally expect some measure of incongruence between
their respective contents. 

Now, while Reid—followed in our days by R. B. Angell [1974]— 
maintained that visual space is a two-dimensional spherical space, 
Luneburg and Blank claimed that it is a hyperbolic three-dimensional space, 
and Battro and his associates attributed to it a three-dimensional 
Riemannian geometry of non-constant curvature (ranging from +1 to −1).
In the light of what I said about the relations between STDO and STMP it 
should at once be clear that such claims cannot be taken literally, for a 
Riemannian space of two or three dimensions and either constant positive, 
constant negative or non-constant curvature must anyway have a topology 
and a differentiable structure which no visual data can warrant, and also 
implies exact metric relations which can only be roughly approximated by 
the gross estimates of size and distance that unaided vision is capable of 
providing. Thus, the most that can be claimed by the rival theorists of visual 
space is that their favourite geometries are optimal idealisations of the 
structure displayed by visual percepts. My reservations with regard to them 
arise not so much from their mutual disagreement as from the nature of their
methodologies. I do not think that one can reach valid conclusions about 
what a healthy grown-up person normally sees, by reasoning a priori from 
the premise that the human eye works like a movie camera turning on a fixed 
pivot, or by collecting laboratory data under such far-fetched experimental 
conditions as I shall describe below. 

Reid builds a priori his ‘geometry of visibles’ on the assumption that 
visual appearances are placed about ‘the eye’—in his chapter on the subject 
Reid writes this word in the singular, as though he were philosophising 
about Cyclopes—which knows nothing of their radial distances, but 
distinguishes the directions in which they lie. As ‘the eye’ rotates about its
centre, the places it sees range over a two-dimensional Riemannian 
manifold, isometric to the sphere. To reach this conclusion, Reid must 
tacitly presuppose that ‘the eye’—unaided by manual operations with 
tangible instruments—is able to measure the angles subtended by the visual 
appearances. This it can certainly do if it can keep track of its own rotations 
and can match them with the displacements of the centre of the visual field. 
Reid’s ideal eye—which we ought to imagine, I guess, as freely rotating in a 
static, transparent socket—may just as well be endowed with this faculty, 
whose exercise and application to the ‘geometry of visibles’ must, however, 
contaminate the latter with haptic features. Evidently Reid’s construction 
depends, unbeknownst to him, on the simple mathematical fact that, under 
the stated assumptions, the group SO(3) of rotations about a fixed point in 
Euclidean 3-space induces a spherical geometry in the visual field. A 
motionless eye, surrounded by a topologically compact, simply connected, 
two-dimensional visual field, would have no grounds for bestowing on it a 
spherical geometry or indeed any metric at all—unless some further 
assumptions are made. 
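
The mathematical fact invoked here can be spelled out in a few lines. What follows is only a sketch: the symbols u, v, R and d are introduced for illustration and are not Reid's. Represent each visual direction by a unit vector in Euclidean 3-space and measure the separation of two directions by the angle they subtend at the centre of the eye:

```latex
\[
  d(u,v) = \arccos(u \cdot v), \qquad
  u, v \in S^2 = \{\, x \in \mathbb{R}^3 : \lVert x \rVert = 1 \,\}.
\]
For every rotation $R \in SO(3)$ we have $R^{\mathsf T} R = I$, hence
\[
  (Ru) \cdot (Rv) = u^{\mathsf T} R^{\mathsf T} R \, v = u \cdot v
  \quad\Longrightarrow\quad
  d(Ru, Rv) = d(u, v).
\]
```

On this reading d is the great-circle distance on the unit sphere, and the rotations of the eye act on the field of directions as isometries of that metric. The metric becomes available to the eye only through those rotations, which is why a motionless eye would have no grounds for adopting it.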

Luneburg sought to ascertain the exact implications of two-eyed vision for
the geometry of visual space. He purportedly achieved this aim by analysing 
the results of some clever experiments, in which the subject is placed in a 
dark room before diverse configurations of luminous points. While ‘motion 
is avoided by fixing the subject’s head in a headrest and he is exposed only to 
static stimuli’, ‘no artificial restriction is placed on the subject’s binocular 
function’; he ‘makes his observations by allowing his eyes to rove at will over 
the entire range of the stimulus configuration until a stable perception of the 
geometry of the situation is achieved’ (Blank [1953], p. 717). In one of the 
experiments—designed and carried out by A. Blumenfeld as early as 1913— 
the subject is presented with two rows of luminous points, symmetrically 
disposed on either side of him, by pairs, on the horizontal plane of his eyes. 
The subject can move each light along a horizontal line perpendicular to the 
vertical plane dividing his left from his right side. He is asked to leave a given 
pair where it stands, and to reorder the others so that he gets to see either (a) 
a distance alley, i.e. two equidistant rows of luminous points receding from
him; or (b) a parallel alley, i.e. two straight rows, perpendicular to the
vertical plane dividing his front from his back. If the geometry of binocular
vision were Euclidean, arrangements (a) and (b) ought to coincide. The
discrepancy between them—as noted by the experimenter in his own haptic 
space—allegedly proves that the subject’s visual space is not Euclidean but 
hyperbolic. 

Troubled by the extreme artificiality of Blumenfeld’s experiment, Battro 
and his associates repeated it under much more liberal circumstances. The 
subject, though still bound to a chair, was allowed to move his head, not just 
his eyes. He was placed in the open air, by daylight, facing an alley of up to 
240 m long and 48 m wide, formed by 5 to 12 pairs of yellow wooden staves 
1 m or 1.5 m high. The two staves at the end of the alley were fixed and the
subject was requested to tell the experimenter how to move the remaining 
staves to the left or to the right, until the subject saw either a ‘parallel alley’ 
or a ‘distance alley’, as defined above. Again the discrepancy between both 
arrangements was supposed to reveal the geometry of the visual field, but, 
‘contrary to commonly accepted results that binocular visual space is 
homogeneous and has a negative curvature, and therefore a Lobachevskian 
geometry’, Battro and his associates ‘observed that the regular alleys gave 
the three possible curvatures of space depending on their size, the particular 
setting and the subject’; whence they concluded ‘that a general Riemannian 
geometry of variable curvature would be more compatible with the 
individual visual data’ (Battro et al. [1976], p. 14). Their tabulated results 
lead me, however, to a somewhat different conclusion. For what I read, for 
instance, in their Table 6 is that for different choices of a subject and of a 
setting of the fixed staves, the measured curvature of binocular visual space 
took one or the other of the five constant values 1.0, 0.5, 0, −0.5 and −1.0.
Since the method followed does not yield the curvature of visual space at one
or more select points, but, if anything, its global curvature over a region
containing the staves, the stated results cannot be said to characterise a single
space of variable curvature, but, if anything, several spaces of diverse but
constant curvature.† Hence, to my mind, the visual geometry disclosed by
such experiments is not a feature of reality to which one might have to adjust 
one’s conduct in life, but an artifact of the experimental procedure. 

Be that as it may, it is worth noting that if a ‘geometry of visibles’ wins 
general acceptance and we are finally led to agree that visual phenomena 
belong to a space of their own, which is structurally incompatible with the 
one in which haptic phenomena are localised, we shall have to learn to say 
that a batsman who sees a ball fly towards him should try to hit, indeed, that 
same ball, but in a different space. Although I do not dispute that intellectual 
progress can sometimes be secured only through language reform, I refuse 
to believe that such a strained use of the words ‘same’ and ‘different’ could 
ever be justified, let alone sustained. 


7 OUR UNDERSTANDING OF THE PROCESS OF 
OBSERVATION 


In observation the observer grasps his own body in physical interaction with 
the objects observed. This is a permanent ingredient of observational 
awareness, at least where haptic perceptions are at play. Such being 
throughout our lives virtually always the case, it is no wonder that, in 
ordinary usage, the statement that a person x observes a thing or event y 
implies the statement that y causes x to be in a state in which he succeeds in 
observing it. Indeed this usage extends to all modes of observation, visual, 
auditive, etc. even where no reference is made to haptic awareness. I do not 
take this linguistic practice to mean that, say, purely visual observations—if 


† The matter is more involved than my hasty discussion suggests. Let A and B denote the
positions of the fixed lights or staves in the alley experiment. In both the dark room and the
open air version, the sought for curvature of visual space is determined by comparing, in the
experimenter’s haptic space, a pair of lines through A and B whose images in the subject’s
visual space are equidistant from each other, with a pair of lines through A and B, coplanar
with the former, whose images in the subject’s visual space are straight and perpendicular to
the segment AB. To be meaningful, the comparison must presuppose that ‘equidistant’,
‘straight’ and ‘perpendicular’ describe definite features of the subject’s visual experience, so
that, when he redistributes the lights or the staves in agreement with such predicates he is not
guided merely by an estimate he would do well to verify with a measuring rod or some other
haptic means, but by a purely optic perception of lengths and angles. If this somewhat
venturesome presupposition is allowed, the comparison can yield a definite value k for the
overall Gaussian curvature of the visual surface corresponding to the haptic plane on which
the lights or staves are placed. The procedure only makes sense if the said Gaussian curvature
is constant. k is relevant to the geometry of visual space if and only if the said visual surface is
totally geodesic. In that case, k is, at each point of the visual surface, one of the local sectional
curvatures that enter into the construction of the Riemann curvature tensor of the visual space
at that point. (Cf. O’Neill [1983], p. 104, Def. 12; p. 101, Cor. 6; p. 79, Cor. 42.) But, since k is
only one of the local sectional curvatures, it does not suffice to determine the Riemann tensor,
unless we assume that the curvature of the visual space is constant. This assumption, of
course, was made and argued for by Luneburg and Blank, but is avoided and finally rejected
by Battro and his associates. The meaning of their measurements is thereby obscured.


there are any such—must involve a claim to being caused by their objects 
(except perhaps when they are on the verge of being painful due to excess of 
light, in which case vision becomes proprioceptive like touch and
kinaesthesia).¹ But visual observations are made by us, men and women of flesh
and blood, who must sit or stand or walk or run or turn or stoop or stretch or, 
at the very least, strain our eyes to see. Haptic awareness is thus pervasive 
and discloses, in one way or another, that we are committed to the physical 
world. Our everyday handling (holding, pressing, pulling, pushing,
twisting) of all sorts of bodies, and our continual exposure to bumping and
falling, heat and cold, wind and water, light and noise, furnish the 
prototypes of our original notions of physical existence and physical action. 
It is therefore very unlikely that we shall ever find occasion of rejecting our 
grasp of ourselves as bodies interacting with other bodies. 

Yet, while men have never seriously hesitated in their grasp of observation 
as a physical process, their general understanding of such processes has 
undergone great changes. For example, Aristotle conceived of a manner of 
physical action that was designed to account for perception and observation. 
By virtue of it, the constitutive ‘form’ of the observed object could be 
transmitted ‘without matter’, through an appropriate intervening medium, 
to the ‘sensitive faculty’ of the observer. (For a curious example of how the 
observer’s properties, in turn, would also be communicated ‘without 
matter’ to the object of observation, see Aristotle, De Somnibus, 459b27 ff.)
This doctrine was taught at school to the founders of modern science, who 
later rejected it and replaced it by a different conception of physical action 
which in part revived pre-Aristotelian notions. Towards the end of the 
seventeenth century the new conception had taken such hold of the best 
minds in Europe that, for example, John Locke ‘found it impossible to 
conceive that body should operate on what it does not touch [. . .], or when it 
does touch, operate any other way than by motion’.² Whence, when he comes
to consider ‘how bodies produce ideas in us’, he declares that it ‘is manifestly
by impulse, the only way which we can conceive bodies to operate in’.³ This
early modern idea of physical action was considerably modified by 


1 J. R. Searle [1983], p. 124, n. 9, proposes the following thought experiment to help remove the
doubts of ‘many philosophers’ who are prepared to agree with him that ‘causation is a part of 
the experience of acting or of tactile and bodily perceptions’, but ‘do not concede that the 
same thing could hold for vision’: ‘Suppose we had the capacity to form visual images as vivid 
as our present visual experiences. Now imagine the difference between forming such an image 
of the front of one’s house as a voluntary action, and actually seeing the front of the house. In 
each case the purely visual content is equally vivid, so what could account for the difference? 
The voluntarily formed images we would experience as caused by us, the visual experience of
the house we would experience as caused by something independent of us.’ But evidently the 
alleged conclusion of Searle’s thought experiment follows only if we beg the question and 
assume that visual images must be experienced as caused. Otherwise, the most we can 
conclude from Searle’s conditions is that involuntarily formed visual images would be 
experienced as not caused by us. 

2 Locke, An Essay Concerning Human Understanding, editions 1 through 3, II. viii. 11.

3 Locke, An Essay . . ., 4th edition, II. viii. 11; cf. IV. ii. 11.

successive generations of natural philosophers, first by the eighteenth
century theorists of instantaneous action at a distance, then, in the 
nineteenth century, by the creators of field theory. One capital ingredient of 
it survives, however, to this day: for us, as for Descartes, Boyle, Huygens, 
etc., all physical action boils down to a transfer of momentum—or, as we 
would rather put it now, of 4-momentum. 

The modern philosophy of nature has presided over great advances in the 
physiology of perception. It has also been associated, from its inception, 
with the modern development of means and methods of impersonal 
observation, which not only have tremendously expanded the scope of our 
knowledge, but should also help us, through our growing familiarity with 
them, to achieve a better grasp of the nature of personal observation. On the 
other hand, the modern idea of physical action has burdened us—also from 
its inception—with the so-called mind-body problem. For, as the seven- 
teenth century occasionalists were quick to see, transfer of momentum will 
neither account for nor be explained by a change of mind. All the attention 
devoted to the problem since Descartes has not brought us any nearer to 
understanding how a man’s decision can initiate a definite outward flow of 
energy and momentum across his skin, or how an inward energy- 
momentum flow across it can modify his state of awareness. And we still do 
not know how to coordinate our particular states of awareness of directly 
observed objects with any well defined, particular effects of the action of 
such objects on our bodies. It is unlikely that this rift between the two sides 
of observation can be closed without some radical, incalculable innovations 
in our understanding of physical action. Since our current understanding 
lies at the heart of so much valuable knowledge, there is little inducement to 
change it. 

Even if our present understanding of the observation process is thus 
limited and beset with difficulties, we are deeply committed to it, and we 
cannot well imagine how some of its implications could be denied. Thus it 
seems clear that, no matter how we conceive physical action, in every 
observation the observed object interacts with a receiver. Such interaction is 
critical to the acquisition of knowledge by observation, for the observer 
cannot ascertain any more features of the observed object than become 
discernible to him through their recorded effect on the receiver. Indeed, a 
state of the receiver can furnish information about a feature of the object 
observed only to the extent (and within the range of ambiguity and 
imprecision) that the said feature is, under the circumstances, a necessary 
condition for the attainment of that state. The receiver’s ‘power of 
resolution’, its capacity to separate (or its tendency to blur) the imprint of
different attributes and states of the object, is a measure of its cognitive
value. From this point of view, impersonal observation, carried out by
means of an increasingly varied and efficient panoply of precision instru- 
ments, enjoys a distinct advantage over personal observation, although it 
cannot come to fruition without the latter. 


8 PERSONAL VERSUS IMPERSONAL OBSERVATION


The philosophical posture of modern science demands that nature be all of 
one piece. Observation processes have no doubt their peculiarities, without 
which they might not serve their purpose, but they are not specifically 
different, as natural phenomena, from other physical processes which are 
not observational. Observational interaction instantiates the same types and 
is governed by the same laws as ordinary physical interaction. Indeed, the 
development of impersonal observation in the modern age could only get 
under way on the understanding that such was the case. Observation devices 
exploit known properties of well typified natural processes for the sake of 
collecting information. Inference from the state of the receiver to the state of 
the observed object must rest on our knowledge of those properties, and can 
therefore hold good only if observation processes are not, physically 
speaking, a class apart. 

Nonetheless, observation processes do differ from their non- 
observational analogues in that they are ordered to an end: they are always 
embeddable in a quest for knowledge. It is a requisite of this teleological 
order that, among the many factors that contribute to a physical process of 
observation, some should stand out as the objects of observation and their 
observed features, while others constitute the receiver and its data- 
recording states.

In impersonal observation, the receiver is usually artificial and is singled 
out by its human manufacturer. It is expressly designed to register the 
interesting effects of the intended object of observation, which has been 
previously singled out by some human research project. Since the object- 
receiver interaction is nevertheless immersed in nature’s flux, great 
ingenuity must usually be devoted to filtering out the ‘noise’ that hinders the 
clean flow of information from the object to the receiver. The status of these 
several items is indeed notional, and depends on the epistemic project which 
the observation is meant to serve. (Cf. Pickering [1984].) The cognitive 
significance of the receiver’s states is a matter of interpretation, depending, 
of course, on the circumstances of the observation, but also, decisively, on 
the observer’s intelligence of the experimental situation. On the frontier of 
research such intelligence is apt to be flimsy. Thus, for example, the
negative result of Michelson’s famous attempt to measure the relative 
motion of the Earth and the ether was understood to indicate (a) that the 
ether is dragged by the Earth’s atmosphere, the protective box in which 
Michelson’s apparatus was contained, etc. (this was Michelson’s own 
conclusion in 1881); (b) that motion of the apparatus across the ether
modifies the molecular forces that hold its parts together, shortening one of
its perpendicular beams while merely narrowing the other (this was
independently suggested by Fitzgerald and Lorentz); and (c) that we live in a
Minkowski spacetime in which light pulses in vacuo follow null worldlines,
so that the speed of light measured on an inertial lab in which time is defined
by Einstein’s method is the same in every direction. (Cf. also the example,
beautifully analysed by Shapere [1982], of the observation of the Sun’s
interior by means of neutrinos, in which the object and the instrument of 
observation are delicately suspended in a tenuous spiderweb of theories.) 

While the cognitive aim of an impersonal observation supervenes on its 
underlying physical processes by the initiative of men, teleology is, so to 
speak, endogenous to personal observation. Here the receiver has not been 
segregated from the mainstream of nature for fact-gathering purposes by an 
external agency, but has grown of itself into a distinct, fairly stable physical 
system, suitably disposed to pick out specific effects of its interaction with 
specific objects. The cognitively relevant receiver states are not presented on 
a dial to the observer’s interpretative acumen, but translate spontaneously 
into observational awareness. The objects of personal observation do not 
have to be inferred from the states they induce in the receiver, for they are 
simply and straightforwardly perceived. In fact, it is rather from his direct 
awareness of them that the observer eventually learns—by inference—what 
receiver states are instrumental to their observation. Thus we have come to 
know that—though we are still quite incapable of explaining how—the 
recorded difference of less than 1/3000 s between a sound’s arrival in our left 
and in our right ear is the source of our awareness of the direction from 
which the sound came; that our visual awareness of the volume of nearby 
bodies rests on the slight difference in the optical input from such bodies 
into each one of our eyes; that our sense of balance and orientation in the 
gravitational field in which we live depends on the flow of liquid along the 
sensitive walls of the semicircular canals in the internal ear. 

We normally have a more or less definite grasp of the direct objects of our 
personal observations, and of their relations of place and time and in some 
cases also of their causal relations with our bodies. This grasp is the source 
from which the theory and practice of impersonal observation ultimately 
draw their sustenance and motivation. Thus, personal observation may 
justly claim metaphysical priority over impersonal observation. But that 
does not bestow on it an epistemic privilege against the latter. For personal 
observations and the ‘natural’, unreflecting grasp of things that goes with 
them are both fallible and corrigible, and are being continually rectified and 
qualified, not only by mutual comparison, but also in the light of impersonal 
observations. Thus, we habitually compare the readings of outdoor 
thermometers or of wristwatches with the estimates of air temperature or of 
time elapsed based on our feelings; a practice which not only serves to 
control and to correct such estimates, but can also contribute significantly 
to improve their accuracy. Personal observation is not only not superior to 
impersonal observation as a source of knowledge about physical objects, 
but, in both scope and precision, it is on the whole markedly inferior. The 
confusion that still prevails in some philosophical circles on this fairly 
obvious and simple matter can only be attributed to a vicious craving for 
certainty. This, of course, will never be satisfied by impersonal observation 
and its intricate scaffolding of theories. But neither can it be quenched by
contracting one’s knowledge claims to the bare subsistence level of common 
sense judgments and direct perceptions. 

Personal observation is, of course, always required for the cognitive 
fulfilment of impersonal observation. It may therefore seem surprising that 
the latter can be more precise and reliable than the former. For a system for 
the transmission of information cannot perform better than its weakest 
component. The solution to this apparent paradox is not far to seek. Human 
sensors are not equally deficient for all tasks. They are rather bad at 
discriminating weights, or temperatures, or light intensities, and they are 
utterly incapable of detecting small changes in atmospheric pressure; but 
they are pretty good at reading clearly printed digits, and may be trusted to 
record a coincidence between a pointer and a thin black line on a white dial. 
Observation devices are designed to translate the often imperceptible effects 
of the observed object on the receiver into easily discernible digital or 
pointer readings. That the outcome of impersonal observations should thus 
ultimately appear in the guise of personally observed data led some 
philosophers to think that a faithful description of such personal observa- 
tions in plain everyday language could give the full ‘cognitive meaning’ of 
the statements, couched in esoteric, ‘theoretical’ terms, in which scientists 
normally report their findings. Of course, in real life things stand just the 
other way around: digital and pointer readings get their distinctive 
interpretation from the theory of the respective instruments, and without it 
they all look quite insignificant and very much the same. 


9 ON THE RELATION BETWEEN OBSERVED OBJECTS AND
RECEIVER STATES 


No difference can be observed in an object which is not recorded as a difference in
the receiver. This principle is central to our current understanding of
observation; and it does not seem possible to deny it, no matter how we 
revise or refine that understanding. Indeed, the principle is so deeply 
ingrained in our language that we would never be said to observe a change we 
know to occur in the object, but which our bodies and the instruments at our 
disposal do not reflect. 

It follows that in any personal observation receiver states must mediate 
between the observed features of the object and the observer’s perception of 
them. We are far from understanding the relation between those states, of 
which we are mostly unaware, and our awareness of the objective situations 
they disclose. That there is no simple correspondence between the 
information bearing states of our sense organs and any relevant states of 
mind can be readily gathered from the three examples of hearing, vision and
balance given in the preceding section. Only by
sinking the cognitively significant receiver states deeper and deeper into the 
unexplored recesses of the brain could one hope to map them one-to-one 
onto the contents of our sense awareness. As neurology advances, such 
terrae incognitae become increasingly unavailable, and one sees ever more
clearly that a mind-brain isomorphism, if at all possible, could only be 
established on the basis of a thoroughly innovative, physically unorthodox 
description of the brain (cf. Searle [1983], p. 272). On the other hand, the 
relation between the said receiver states and the matching features of the 
object can be handled by the standard methods of physics. In this respect 
there is no essential difference between personal and impersonal observa- 
tion. And indeed virtually all progress in the physiology of perception, since 
Kepler first conceived the eye on the analogy of the camera obscura, has been 
achieved by treating the organs of sense as impersonal receivers. 

Object-receiver relations in personal and impersonal observations take 
varied forms and their study pertains to diverse fields of science. But they all 
share at least one common trait which must be considered in a philosophical 
discussion of observation. A receiver state conveys information about the 
presence of a certain feature in an object only in so far as this feature is a 
necessary condition of that state. 
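
The principle can be illustrated by a toy model. This is a minimal sketch only: the table of conditions, the feature names and the function `necessary_features` are hypothetical, chosen to loosely anticipate the X-ray and telescope examples discussed below, and they describe no actual instrument. Each receiver state is listed with the alternative sets of object features that could each give rise to it under the circumstances. The state then conveys only the features common to all the alternatives, since only these are necessary conditions of it.

```python
from typing import Dict, FrozenSet, List

# Hypothetical toy table: for each receiver state, the alternative
# combinations of object features that could each produce that state.
CAUSES: Dict[str, List[FrozenSet[str]]] = {
    "grey_shadow": [
        frozenset({"dense_tissue", "inflammation"}),
        frozenset({"dense_tissue", "scar"}),
        frozenset({"dense_tissue", "tumour"}),
    ],
    "double_spot": [
        frozenset({"light_source", "two_sources"}),
        frozenset({"light_source", "gravitational_lens"}),
    ],
}


def necessary_features(state: str) -> FrozenSet[str]:
    """Return the features present in every alternative cause of `state`.

    Only these features are necessary conditions of the state, so only
    these are disclosed by it without further background knowledge.
    """
    alternatives = CAUSES[state]
    return frozenset.intersection(*alternatives)


if __name__ == "__main__":
    for state in CAUSES:
        print(state, "->", sorted(necessary_features(state)))
```

In this toy table the grey shadow discloses only the presence of dense tissue, and the double spot only the presence of some light source. Deciding among the remaining alternatives requires narrowing the table, which is the role played by experience and theory in the examples that follow.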

Consider impersonal observation. We assume that there is no difficulty in 
classifying and recognizing observationally significant receiver states. 
However, a definite receiver state will not, as a rule, unambiguously point to 
an equally definite feature in the object. Such a state can normally arise due 
to diverse conditions, some of which need not even concern the intended 
object of observation. (Precision measurements can be greatly impaired by 
thermal variations in the instruments employed.) But even where such 
perturbing factors are negligible, the distinguishable states of the receiver 
may not suffice to discriminate between significantly different properties of 
the object. A grey shadow on a medical X-ray picture can reflect all sorts of 
conditions in the patient’s body. To judge what is actually disclosed by it, an 
observer must rely on his experience of similar X-ray pictures and on his 
general knowledge of medicine. A coupled pair of spots in a telescope 
photograph of a piece of sky is usually taken as evidence that in the direction 
of those spots there are two, possibly associated, astronomical light sources; 
but they might exceptionally be caused by a single source, if the beam of 
light it sends towards us is split, on the way to our telescope, by a 
gravitational lens. To decide that the latter is indeed the case an observer 
must carefully examine the circumstances in the light of gravitational 
theory. There are, indeed, plenty of cases in which the record of an 
impersonal observation tells an observer exactly what he wishes to know 
about an object, although he has no inkling of how the observation works 
and of what precisely is recorded by the receiver. Most of us ordinarily 
employ instruments of observation to learn about our surroundings in such a
thoughtless way. But we can do so only because a vast repertoire of
object-receiver correlations has been firmly established by scientific and
technological research. Such research is anything but thoughtless. It does not simply
proceed by trying out any old instrument on a class of objects and setting up 
by straight-rule induction a correspondence between the alternative states 
of the former and the interesting differences among the latter. The
impersonal receivers in current use in all walks of life have for the most part 
been painstakingly developed in the light of scientific theories which entail 
certain necessary connections between diverse features of interest in our 
environment and directly observable receiver states. This is not the place to 
examine what type of necessity scientific theorising discovers—or induces— 
in nature. But one should bear in mind that impersonal observation is 
impossible without it. A particular receiver state can disclose a particular 
state of affairs only if the latter is, under the circumstances, a necessary 
condition of the former. To know this, one must grasp them both as 
instances of general types which stand in suitable relations of entailment to 
one another. Such typifications are not ready-made, but are the outcome of 
scientific thought. We may indeed unreflectingly profit from the impersonal 
observations with well-established significance which are taking place all 
about us. But we could not without reflection and theory-guided invention 
have brought them under way. There are some apparent exceptions, so-called
accidental discoveries, but ultimately they also confirm this rule.
Thus, for example, a photographic plate stored together with a preparation 
of uranium salts from 27 February to 1 March 1896 inside a drawer of Henri 
Becquerel’s laboratory, which was exposed notwithstanding the absence of 
light in the drawer, recorded the first observation of radioactivity; but it took 
Becquerel’s alertness and preparedness (he himself had mounted the
uranium salts on the plate to study their phosphorescence under sunlight,
but no sun shone on Paris in those winter days) to grasp as an observation
record what another would have discarded as a spoilt plate.
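
The necessity requirement stated above can be illustrated by a minimal sketch, assuming a toy finite model in which every possible condition of the object produces exactly one receiver state. The table `RECEIVER_RESPONSE` and the functions `conditions_compatible_with` and `conveys` are hypothetical names introduced only for this illustration, and the medical features are invented placeholders for the X-ray case. On this model a receiver state conveys a feature only if that feature is present in every condition that could have produced the state.

```python
# Toy model (an assumption for illustration only): each possible condition of
# the object is a set of features and produces exactly one receiver state.
RECEIVER_RESPONSE = {
    frozenset({"dense_tissue", "tumour"}): "grey_shadow",
    frozenset({"dense_tissue", "scar"}): "grey_shadow",
    frozenset({"healthy"}): "clear",
}


def conditions_compatible_with(state):
    """Return every object condition that could have produced this state."""
    return [cond for cond, s in RECEIVER_RESPONSE.items() if s == state]


def conveys(state, feature):
    """True if the feature is a necessary condition of the receiver state,
    i.e. it belongs to every condition that could have produced the state."""
    conds = conditions_compatible_with(state)
    return bool(conds) and all(feature in cond for cond in conds)


# The shadow discloses dense tissue, but not which kind of dense tissue.
print(conveys("grey_shadow", "dense_tissue"))
print(conveys("grey_shadow", "tumour"))
```

In this sketch the grey shadow settles the presence of dense tissue but leaves open whether it is a tumour or a scar, which is why the observer must draw on further knowledge to judge what the record discloses.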

Personal observation stands, physically, under the same principles as 
impersonal observation. A person cannot become aware, by observation, of 
a change in an object unless the latter effects a change in his body. A state of a
human body can convey information about a feature of its surroundings only
to the extent that this feature is a necessary condition of that state.¹
However, not every state of the body is a source of observational awareness; 
nor do those which are disclose every one of their necessary conditions. 


1 The following example might suggest that the above requirement is too stringent. 
Unbeknownst to me, John, the postman who brings the mail to my neighbourhood, has an 
identical twin, Jack, who also works for the Postal Service. Suppose that this morning, as I 
went out of my building, I saw a postman across the street, whom I immediately took to be 
John, and indeed John he was. One could then say that I knew John as soon as I saw him, and 
that the information was conveyed by my eyes. Yet John’s presence across the street was not a 
necessary condition of the state of my optical receptors when I looked at him from my door, 
for a state indistinguishable from it would have been effected by the presence of Jack, in the 
same uniform and posture, at the same spot. Of course, had Jack been there, my brain would 
have reacted in the same way to the visual stimuli, and I would have mistaken him for John. I 
do not think, however, that the example proves that the requirement of necessity I stated in 
the text is excessive. What it shows, to my mind, is that the information conveyed by my eyes
when I see the postman across the street suffices, at most, to establish that he is either John or 
Jack, and that my correct perception of him as being John goes beyond that information and 
involves a happy guess (aided, of course, by my ignorance of Jack’s existence). 
Observational awareness is selective: the observer’s attention, guided by his
interests and preconceptions, falls at any given time only on a small part of 
the current range of his consciousness. Observational awareness is self- 
transcending: it is no mere epiphany of organic states, but the grasp of an 
object against the background of a world. Hence, while in impersonal 
observation the facts of the matter must be inferred from a suitable 
description of the receiver states in the light of scientific theories and a 
general assessment of the circumstances (or by means of the ‘inference 
tickets’ provided by the user’s manual that comes with the instrument), in 
personal observation the actual presence of such-and-such an object is not a 
conclusion that needs to be drawn deductively or inductively from the 
momentary state of our body, for we are, so to speak, pre-programmed to 
jump to it straight away. (Cf. Fodor [1984].) The observer’s grasp of the 
object can be rectified to comply with earlier or further experiences, with 
scientific theories, or even with philosophical criticism. But it cannot be 
suppressed from observational awareness without destroying the latter’s 
observational character. Thanks to his immediate grasp of the environment 
in which his body is placed, the human observer develops an understanding 
of observation as a physical process and increasingly sophisticated theories 
about object-receiver links. Such theories are not required to get personal 
observation going—indeed, they would not even be possible if observational 
awareness did not precede them—but they certainly change our grasp of 
what we observe personally. Their might is demonstrated by the total 
incapacity of this writer and, presumably, of the reader to see ghosts and
to hear voices from another world, abilities which are not uncommon among 
people who have a different understanding of light and sound. 
The Physical Content of
Minkowski Geometry 


by BRENT MUNDY¹


The standard coordinate-based formulation of the space-time theory of special 
relativity (Minkowski geometry) is philosophically unsatisfactory for various 
reasons. We here present an explicit axiomatic formulation of that theory in terms of 
primitives with a definitive physical interpretation, prove its equivalence to the 
standard coordinate formulation, and draw various philosophical conclusions 
concerning the physical content and assumptions of the space-time theory. The 
prevalent causal interpretation of physical Minkowski geometry deriving from 
Reichenbach is criticised on the basis of the present formulation. 
1 INTRODUCTION


The space-time geometry of special relativity (Minkowski geometry) is 
ordinarily presented in the literature of physics with the use of space-time 
coordinates x, y, z, t. These coordinates are given a physical interpretation
by specification of procedures to be employed in assigning numerical
coordinate values to any given event. These specifications are given in
informal terms: the spatial coordinates x, y, z are to be determined with the 
use of rigid rods, and the time coordinate t is to be determined with the use of 
a clock located at the origin, together with the well-known Einstein 
procedure for the determination of distant simultaneity. The primary 
physical content of the space-time theory of special relativity is then taken to 
pertain to the relations among the coordinates assigned to given events by 
the coordinate systems associated with frames of reference in uniform 
relative motion, these relations being given by the Lorentz transformations. I
will refer to this as the standard formulation of the space-time theory of
special relativity.

1 This paper was written while the author held a Mellon post-doctoral fellowship at the
University of Pittsburgh. Its content derives primarily from the author’s doctoral
dissertation, Mundy [1982].

There are several respects in which the standard formulation may be 
considered as inadequate or misleading, from a philosophical viewpoint. In 
the first place, it leaves some uncertainty as to what the theory is a theory of. 
Taking the standard presentations literally, it seems to be a theory of 
coordinate systems and their properties and relations. This is somewhat 
disturbing, since a coordinate system is, after all, an arbitrary and artificial 
human construct, part of our conceptual apparatus for the description of 
nature, rather than a proper part of the subject matter of physics itself. An 
uncritical reader may easily draw the conclusion that the content of the 
theory is at least partially epistemological in character, rather than 
straightforwardly physical. Indeed, even some physicists (e.g. Eddington) 
have drawn this conclusion, being led thereby to a somewhat idealistic 
interpretation of the theory as a theory of ‘observers’. 

If one wishes to reject the idealist interpretation, it is necessary to specify 
some objective physical subject matter for the theory. The most natural 
suggestion would be to take the subject matter of the theory as comprising 
those types of physical fact or process which figure in the construction of a 
coordinate system of the standard type. The theory would then be taken as 
an objective physical theory of those types of physical fact or process, 
independently of any epistemic use to which they might be put. This is in 
fact the course which I shall follow in this paper. There are two prima facie 
difficulties. First, it is not clear precisely what physical facts or processes 
these are. This is another aspect of the uncertainty of the subject matter in 
the standard formulation. Second, it is not clear precisely what laws 
governing these physical facts or processes the theory is to be taken as 
proposing, and by what observations these laws are to be taken as supported. 

What is required, therefore, is something of the nature of an axiomatic 
formulation of the space-time theory, in which a definite set of primitive 
terms is specified, each having a definite physical interpretation. This will 
fix the physical subject matter of the theory. The axioms in turn will express 
the physical laws which the theory is taken to propose concerning the 
physical facts or processes which provide the interpretation. Equivalence 
with the standard formulation will be shown by proving on the basis of the 
axioms that space-time coordinates may be introduced in the standard 
fashion, and will have the properties ascribed to them by the standard 
formulation. This will be done below. 

There is an extensive literature, both philosophical and mathematical, 
concerned with axiomatisation of Minkowski geometry.¹ Rather remarkably,
much of this literature has been dominated by analyses in which the
concept of causation plays a central role as a primitive notion. Writers in this
tradition have indeed provided answers to the two questions raised above,
concerning the physical subject matter and basic assumptions of the
space-time theory. They have identified as part or all of the subject matter of the
space-time theory the general relation of causation, or the relation of
possible causal connection between two events, and they have identified as
basic assumptions of the theory certain propositions about causal relations
in general, such as the proposition that no causal influence may propagate
more rapidly than light.

1 Within the philosophical literature the most influential axiomatisations have probably been
those of Reichenbach [1924] and Mehlberg [1935]. The pioneering work of Robb [1914-36]
has recently been much discussed by philosophers: cf. several of the papers in Earman et al.
(eds.) [1977]. An axiomatic development similar in character to Robb’s but much simpler will
be found in Mundy (a). Schutz [1973], [1981] and Stachel [1983] give extensive references to
the mathematical and physical literature on axiomatisation.

This is remarkable because the term ‘cause’ does not even appear in 
Einstein’s 1905 paper, and there is no obvious appeal to any general 
concept of causation among the arguments presented there. The actual 
arguments are entirely concerned with the properties of coordinate systems, 
and Einstein makes quite explicit statements concerning how the coordinate 
systems are to be constructed, using rods, clocks, and light rays. If we follow 
the above suggestion of seeking for the physical subject matter of the theory 
among the types of physical fact which are involved in the construction of a 
coordinate system of the type referred to in the ordinary informal 
presentations, we are led to consider facts concerning rigid bodies, clocks, 
and light rays. Nothing at all directs our attention toward facts concerning 
causation in general. 

The thesis of the present paper is that this first impression is in fact 
correct: that the space-time theory of special relativity is in fact a theory of
the behaviour of rigid bodies, clocks, and light rays (and of inertial motion). 
Consequently I am asserting that the long philosophical tradition of causal 
interpretation of the space-time theory (Reichenbach, Mehlberg, Carnap, 
Grünbaum, van Fraassen, Salmon) is mistaken.¹

The main part of the present paper presents an explicit axiomatic analysis 
of the space-time theory in terms of the types of physical fact which I 
identify as its subject matter. In contrast with the central role played by the 
limiting character of the velocity of light in most causal analyses, the essential 
physical assumption of the theory on my analysis is the constancy of the 
velocity of light for all inertial frames. To emphasise this point I have 
presented an axiomatisation in which all of the axioms except that 
expressing the constancy of the velocity of light are true in pre-relativistic 
optics and space-time theory. 

The final section of the paper argues directly against the causal 
interpretations. I accuse Reichenbach and his followers of a misunderstand- 
ing of the physical significance of Einstein’s definition of simultaneity, and 
of equivocating between two different notions of simultaneity. 


1 The present paper was written before I had seen the paper Nerlich [1982], which also 
criticises the causal interpretations of Minkowski geometry, from a somewhat different 
viewpoint and using different arguments. Basically I agree with Nerlich’s criticisms of the 
causal interpretations, but disagree with his suggested alternative interpretations. I do not 
have room here to go into the disagreements in detail. 
2 THE PROJECT OF AXIOMATISATION


I shall begin with some remarks concerning the aim and intended 
significance of the axiom system to follow, hoping to avert some possible 
misunderstandings. Recall first that the goal is to analyse the space-time 
theory as a physical theory. We are not considering Minkowski geometry 
merely as a mathematical object, but rather as a mathematical object which 
is already endowed with at least one physical interpretation (possibly a 
distinguished one) in terms of a particular set of concepts definable within it. 
(Such an interpretation must exist because physicists are in fact using 
Minkowski geometry as part of a physical theory, deriving empirical 
predictions using it, etc.) The purpose therefore is not merely to find some 
set of primitive concepts and axioms which shall suffice to determine the 
structure of Minkowski geometry as a mathematical object, but rather to do 
so in a way which is as faithful as possible to the existing physical 
interpretation of the theory. 

This goal places constraints upon the selection both of primitive concepts 
and, to a lesser degree, of axioms. The selection of primitive concepts is 
strongly constrained because we wish to take as primitive precisely those 
concepts definable within the standard formalism which form the basis for 
the existing standard physical interpretation of that formalism. Since that 
interpretation is based upon a particular specified procedure for the 
construction of a coordinate system (using rods, clocks and light rays), we 
must look for primitive concepts which represent as accurately as possible 
the types of physical fact which are involved in that constructive procedure. 
This is the topic of section 3. 

Therefore (in contrast with many axiomatic endeavours) there is no 
question here of aiming for formal simplicity or economy in the choice of 
primitive concepts. It is entirely possible, and I believe true, that the 
existing physical interpretation of Minkowski geometry involves several 
distinct physical concepts, and that the theory is therefore most accurately 
formalised using a corresponding number of primitive terms. 

The standard coordinate-based theory will determine an infinite set of 
propositions expressible in terms of the chosen primitives, the totality of 
which constitute the physical content of the theory under its physical 
interpretation in terms of those primitives. The initially-posed problem of 
specifying the physical subject-matter and content of the theory is solved 
already just by the specification of the primitive concepts, their physical 
interpretation, and the infinite set of propositions expressible in terms of 
those concepts which the theory comprises. The specification of that infinite 
set of theorems as the deductive closure of a particular finite subset of them 
designated as axioms is not essential for determination of the physical 
content of the theory. Thus my later argumentation that various causal 
propositions are not part of the physical content of the theory also does not 
require such an axiomatic presentation.
Axiomatisations possess some interest nonetheless, since the different
propositions asserted by the theory are not all epistemically equivalent.
There are at least four desiderata for axioms within a proposed formal 
reconstruction of a physical theory, in addition to the basic requirement that 
they yield the proper set of theorems. First is the obvious one of 
mathematical simplicity. Second is physical significance: an axiom should 
express, under the intended interpretation of the primitive concepts, a 
proposed physical law with a reasonably straightforward and testable 
physical content. Two further desiderata emerge when one considers the 
position of the theory under study within a historical context of develop- 
ment and within historical and logical contexts of competition and 
comparison of alternative theories of the same subject-matter. Third is 
historical accuracy: it is of interest to find axioms whose physical content was 
explicitly or implicitly accepted by a particular community of users of the 
theory (such as its founders), and which figured in their own logical 
elaboration of the theory. Fourth is utility for comparison: axiomatisation 
may shed light upon the logical relations among related theories (e.g. as with 
non-Euclidean geometry and the parallel axiom). 

I have asserted that the ‘causal’ theorists such as Robb, Reichenbach and 
Mehlberg are mistaken as to the physical subject-matter of the space-time 
theory. If so, this implies that their axiomatisations are deficient in at least 
the third respect: the causal principles proposed as axioms cannot have been 
important historically if causality was not part of the intended subject- 
matter of the theory. In addition to identifying a different set of primitive 
concepts, I here put forward an explicit axiom system in terms of those 
concepts as an alternative to the causal ones. This serves several purposes. 
First, it shows that this set of concepts does indeed suffice for the 
development of the standard space-time theory, and hence that this general 
type of axiomatic approach is a viable alternative to the causal one. Second, I 
hope to satisfy several of the desiderata listed above more fully than the 
causal axiomatisations succeed in doing. In addition to starting from what I 
think to be the correct set of primitive concepts, most of the axioms seem to 
me to express reasonably straightforward physical propositions which
would have been accepted as true or at least plausible by the early adherents 
of the theory. Thus I hope that the present axiom system possesses greater 
physical significance and historical accuracy than the causal axiomatis- 
ations. I should stress, however, that I do not aim at exact formal 
reconstruction of any one sequence of reasoning such as that of Einstein 
[1905]. 

The present axiom system has also been constructed so as to possess some 
utility for comparison. As I have stressed in Mundy [1983], it is possible to 
derive the Lorentz transformations from propositions which were accepted 
in pre-relativistic optics and space-time theory, together with the assump- 
tion of the constancy of the speed of light. Thus there is a sense in which that 
is the sole new law asserted by the space-time theory; the rest is merely a 
matter of deciding which propositions of the old theory may still be
maintained, and which must be given up. The present axiom system has 
been formulated so as to make this point explicit by using as axioms 
propositions which are true in pre-relativistic theory, together with an 
axiom asserting the constancy of the speed of light. 
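
The contrast between the pre-relativistic theory and the constancy axiom can be made concrete with a minimal numerical sketch. It is not the derivation referred to in the text. It only assumes one spatial dimension, units in which the speed of light is 1, and the textbook forms of the Galilean and Lorentz transformations. The names `C`, `galilean`, `lorentz` and `speed` are hypothetical and introduced for illustration only.

```python
import math

C = 1.0  # speed of light, in units where c = 1 (a convention assumed here)


def galilean(t, x, v):
    """Pre-relativistic transformation to a frame moving with velocity v."""
    return t, x - v * t


def lorentz(t, x, v):
    """Lorentz boost along x to a frame moving with velocity v (|v| < C)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (t - v * x / C**2), gamma * (x - v * t)


def speed(transform, u, v):
    """Speed in the moving frame of a signal that leaves the common origin
    with speed u in the original frame."""
    t1, x1 = transform(1.0, u * 1.0, v)  # event on the signal's path at t = 1
    return x1 / t1


# A light signal as judged from a frame moving at 0.6 c.
print(speed(galilean, C, 0.6))  # no longer equal to C
print(speed(lorentz, C, 0.6))   # still equal to C, up to rounding
```

Under the Galilean rule the light signal changes speed between frames, whereas under the Lorentz boost it does not, which is the sense in which constancy is the proposition that separates the two theories.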

This makes for a nice formal opposition between the present analysis and 
the causal ones: the latter take the limiting character of the speed of light as 
basic, whereas I take its constancy as basic. This difference again shows the 
greater prima facie historical accuracy of the present analysis, since of course 
the constancy of the speed of light was explicitly put forward by Einstein as 
one of his fundamental postulates, whereas the limiting character of the 
speed of light is not treated in any such fashion. (Later I shall discuss the 
arguments of Reichenbach and his followers that Einstein’s procedure 
‘presupposes’ the limiting character of the speed of light.) 

There is some formal connection between the present axiomatic approach 
and the mainstream of modern axiomatic foundations of geometry, stem- 
ming from Hilbert’s classical work of 1899. In that approach there are two 
primitive geometrical concepts: a three-place relation B(a, b, c) of affine
betweenness, being the relation that obtains between three points a, b, c when
b lies between a and c on a straight line, and a four-place relation C(a, b, c, d)
of segment congruence, being the relation that the lengths of the two segments 
ab and cd are equal. These concepts are direct formal representations 
respectively of the two basic operations of application of a straight-edge or of 
a compass which were thought of as constituting classical synthetic 
geometry, and Hilbert’s axioms express basic geometrical laws concerning 
the outcomes of those operations. If classical spatial geometry is considered 
as an empirical theory of the geometrical properties of material bodies, 
Hilbert’s axiomatisation of Euclidean geometry constitutes a formalisation 
of its physical content much like what I here aim at for Minkowski 
geometry. 
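
The two Hilbert-style primitives can be given a coordinate reading in the Euclidean case. The following is a minimal sketch under that assumption: points are tuples of Cartesian coordinates, and the function names `between` and `congruent` and the tolerance `tol` are hypothetical choices made for illustration. In the axiomatic approach the relations are primitive and are not defined from distances, so the code shows only one model of them.

```python
import math


def between(a, b, c, tol=1e-9):
    """B(a, b, c): b lies strictly between a and c on a straight line.

    For distinct points this holds exactly when |ab| + |bc| = |ac|.
    """
    if a == b or b == c or a == c:
        return False
    return abs(math.dist(a, b) + math.dist(b, c) - math.dist(a, c)) <= tol


def congruent(a, b, c, d, tol=1e-9):
    """C(a, b, c, d): the segments ab and cd have equal length."""
    return abs(math.dist(a, b) - math.dist(c, d)) <= tol


print(between((0, 0), (1, 1), (2, 2)))            # collinear, b in the middle
print(between((0, 0), (1, 0), (2, 2)))            # b is off the line ac
print(congruent((0, 0), (3, 4), (1, 1), (6, 1)))  # both segments have length 5
```

The first relation corresponds to what a straight-edge can establish and the second to what a compass can establish, without any appeal to numerical coordinates in the axioms themselves.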

The mathematical structure of Minkowski geometry is very similar to that 
of Euclidean geometry, the only difference being the dimension and the 
signature of the inner product. Therefore there is every reason to expect as a
mathematical possibility that Minkowski geometry should be capable of 
axiomatisation in a manner bearing a strong formal analogy to the classical 
Hilbert-style axiomatisations of Euclidean geometry. Surprisingly, this 
possibility seems to have been investigated only rather recently.¹ Dorling
has obtained decisive mathematical results concerning the derivation of 
Minkowski geometry from a minimal modification (a change of one 
congruence axiom) of a Hilbert-style axiomatisation of Euclidean geometry. 
The proofs of Dorling’s results are long and difficult. 


1 Jon Dorling has been conducting research in this area since around 1970, which however has 
not been published. A summary of this work is to appear as Dorling (a). The present work, 
drawn from Mundy [1982], was completed without knowledge of Dorling’s related 
investigations. The formulation of Stachel [1983] is also of interest in this connection. 
The primitive concepts and some of the axioms of the present axiom
system do also bear some formal analogy to those of the Hilbert tradition in 
foundations of Euclidean geometry. However, the aims and the results of 
Dorling’s and the present investigation are quite different. Dorling has 
carried out comparative axiomatic analyses of Euclidean and Minkowski 
geometry as geometrical theories formalised using the classical Hilbert 
primitives of betweenness and congruence, and has reached deep and 
interesting results. But Dorling’s focus is upon the mathematical structure 
of the two geometries as axiomatised in terms of these geometrical 
primitives, whereas my focus is upon the intended physical interpretation of 
Minkowski geometry as a physical theory. Dorling does not try in this way to 
stay close to the empirical and historical basis of the space-time theory. For 
example, an essential role is played in the physical argumentation by 
considerations concerning the properties of light, especially the law of the 
constancy of the speed of light for all inertial frames. No postulate 
corresponding to this law occurs in Dorling’s axiomatisations, and in 
Dorling’s axiomatic systems the distinguishing property of the family of 
lines corresponding to the null lines of standard Minkowski space is one 
which is of doubtful physical significance.¹ Furthermore the existence
of lines having this property is for him a deep and important theorem, not 
an assumption. This is in striking contrast to the standard physical inter- 
pretation of Minkowski geometry, where the physical existence of a dis- 
tinguished family of lines corresponding to the space-time paths of light 
propagation is one of the basic physical assumptions upon which the 
space-time theory rests. 


1 The property in question is that such a line contains segments which are congruent to an
‘improper’ segment, i.e. a segment whose two ends coincide. Thus, quite aside from the
problem of accelerating a rod or clock to the speed of light in order to measure intervals along a 
null segment, one requires rods or clocks of zero extension with which to carry out the 
measurement! Dorling notes the problems of physical interpretation of his congruence 
primitive as applied to optical and spacelike intervals. His solution is to abandon this 
application altogether and to construct a second axiomatisation based solely upon the 
application of congruence to timelike intervals, which has a direct physical interpretation in
terms of clocks. In this way he arrives at an axiomatisation still in terms of the Hilbert 
primitives, but whose physical interpretation refers only to clocks and inertial motion. (This 
contrasts interestingly with the formulation of Stachel [1983], which refers only to rods and 
inertial motion.) Dorling’s second formulation again avoids appeal to the physically crucial 
laws concerning the propagation of light. The particular axiomatisation again involves a 
minimal modification of a Hilbert-style axiomatisation for Euclidean geometry, and again the 
proof of equivalence with standard Minkowski geometry is long and difficult. (In un- 
published work conducted independently I have arrived at axiomatisations along the same 
lines using these primitives for which the proofs are not so difficult, but the axioms are also not 
so similar to those familiar from Euclidean geometry.)

My own alternative response to what is essentially the same formal problem is to abandon 
the classical Hilbert primitives and the exact mathematical analogy to Euclidean geometry 
which is central to Dorling’s investigations. Instead I take a physically more realistic set of 
primitive concepts designed to represent accurately the physical basis of Minkowski geometry 
in the same way that Hilbert’s original primitives represent accurately the physical basis of 
Euclidean geometry in operations with straight-edge and compass. 
The proof of the representation theorem for the present axiom system is
very much simpler than for those of Dorling. This is to be expected, since all 
I aim to do is to make explicit one set of qualitative physical assumptions 
which might reasonably be taken to underlie the argumentation concerning 
coordinate systems put forward in the standard expositions of the theory. 
Since that argumentation is not very long or complicated, one should not 
expect an axiomatic reconstruction to be so. 


3 THE PRIMITIVE CONCEPTS 


The axiomatisation proposed here has five primitive concepts, correspond- 
ing to what I believe to be five distinct aspects of the standard procedure for 
the construction of a space-time coordinate system. The basic elements are 
point-events of space-time in the familiar idealised sense, henceforth simply 
called points. The five primitive concepts represent five distinct physical
relations among points, each of which is manifested by a certain type of 
(idealised) physical fact or process. Except where noted it should be clear 
what mathematical relations among points of Minkowski geometry (de- 
finable in coordinate terms) correspond to these physical concepts. 

The first two relations are: the relation that point b lies between points a 
and c on the (possible) path of a light ray, called lightlike betweenness, and 
written as B_l(a, b, c); and the relation that point b lies between points a and c
on the (possible) path of a freely-moving mass-point, called timelike
betweenness, and written as B_t(a, b, c). These two relations represent two
different types of physical fact or process (light propagation and inertial 
motion) occurring in connection with the physical interpretation of 
Minkowski geometry, each of which is such as to determine a class 
of linearly ordered sets of space-time points. The representation of each of 
these processes by a three-place primitive is somewhat conventional: what 
we observe are distinguished sets of points possessing a linear ordering, and 
it is known from the foundations of geometry that the formal properties of 
such sets may be conveniently represented by means of such a three-place 
betweenness relation satisfying certain formal laws (the order axioms of 
affine geometry, stated in the Appendix). Both relations satisfy these formal 
laws, since they both determine linear sets of points, but the physical 
processes themselves are very different, and therefore descriptive accuracy 
requires the use of two distinct betweenness relations as primitive. 
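The coordinate counterparts of the two betweenness relations can be sketched as follows. This is only an illustration in the Minkowski model, under assumptions not made in the text: points are coordinate tuples (t, x, y, z) in an inertial chart, units are chosen so that c = 1, betweenness is taken to be strict, and the helper names `interval` and `strictly_between` are hypothetical.

```python
from fractions import Fraction

# Assumption: points are tuples (t, x, y, z) in an inertial chart of
# Minkowski geometry, with units chosen so that c = 1.

def interval(a, c):
    """Squared interval dt^2 - dx^2 - dy^2 - dz^2 between points a and c."""
    dt, dx, dy, dz = (Fraction(q) - Fraction(p) for p, q in zip(a, c))
    return dt * dt - dx * dx - dy * dy - dz * dz

def strictly_between(a, b, c):
    """True if b = a + s*(c - a) for some 0 < s < 1 (requires a != c)."""
    d = [Fraction(q) - Fraction(p) for p, q in zip(a, c)]
    e = [Fraction(q) - Fraction(p) for p, q in zip(a, b)]
    if all(x == 0 for x in d):
        return False
    # Fix s from the first non-zero component of c - a, then check the rest.
    i = next(k for k, x in enumerate(d) if x != 0)
    s = e[i] / d[i]
    return 0 < s < 1 and all(e[k] == s * d[k] for k in range(4))

def B_L(a, b, c):
    """Lightlike betweenness: b between a and c on a line of zero interval."""
    return strictly_between(a, b, c) and interval(a, c) == 0

def B_T(a, b, c):
    """Timelike betweenness: b between a and c on a line of positive interval."""
    return strictly_between(a, b, c) and interval(a, c) > 0

origin = (0, 0, 0, 0)
print(B_L(origin, (1, 1, 0, 0), (2, 2, 0, 0)))  # a light ray along x
print(B_T(origin, (1, 0, 0, 0), (3, 0, 0, 0)))  # a mass-point at rest
```

Both relations share the same affine test and differ only in the sign condition on the interval, which mirrors the remark that the formal order laws are common while the physical processes are distinct.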

In the third place we have a relation of spacelike betweenness. In physical 
terms, however, there is no simple process which corresponds directly to the 
observation of what one would geometrically represent as betweenness of 
points on a spacelike line of Minkowski geometry. Rather, the correspond- 
ing physical observation is that three parallel timelike lines or segments are 
coplanar, and that one lies between the other two in their common plane. 
This is the natural geometrical representation of the information which is 
furnished by the application of a physical straight-edge to determine a 
spatial betweenness relation. The straight-edge is in uniform inertial
motion, so that the material points along its edge are describing a family of 
parallel and coplanar lines, and the relation of spatial betweenness among 
the points along the edge determines a corresponding relation among all of 
those parallel timelike lines. If any three objects are coincident with points 
along the straight-edge for a finite period of time then they are also 
describing parallel timelike lines having betweenness relations correspond- 
ing to those of the points on the straight-edge. The information that this 
relation obtains is what I take to be furnished by observations using an 
unmarked straight-edge. This is a six-place relation B_S(a, a', b, b', c, c') of
spatial betweenness among the six points which define the ends of the three 
parallel timelike segments during the period of application of the straight- 
edge. This modification of the three-place betweenness relation of Hilbert- 
style foundations of geometry is required in order to represent formally the 
space-time use of a straight-edge to the same degree of accuracy with which 
Hilbert’s primitive represents the purely spatial use of a straight-edge.1

In the fourth place we have a four-place relation C_T(a, b, c, d) of temporal
congruence, representing the physical relation that the time elapsed between 
two points a and b along the history of a freely moving mass point is the same 
as the time elapsed between two points c and d along another freely moving 
mass point, as these times would be measured by clocks moving with the 
mass points. This four-place relation of congruence of temporal segments 
follows the standard Hilbert pattern. 
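In the coordinate model of Minkowski geometry, temporal congruence corresponds to equality of proper time along timelike segments. The sketch below makes that correspondence concrete; it assumes an inertial chart (t, x, y, z) with c = 1 and compares squared proper times so that the arithmetic stays exact. In Maxwell geometry the corresponding test would compare differences of absolute time instead.

```python
from fractions import Fraction

def interval(a, b):
    """Squared interval dt^2 - dx^2 - dy^2 - dz^2 (assumed chart, c = 1)."""
    dt, dx, dy, dz = (Fraction(q) - Fraction(p) for p, q in zip(a, b))
    return dt * dt - dx * dx - dy * dy - dz * dz

def C_T(a, b, c, d):
    """Temporal congruence in the Minkowski model: ab and cd are both
    timelike and have the same squared proper time."""
    tau2_ab = interval(a, b)
    tau2_cd = interval(c, d)
    return tau2_ab > 0 and tau2_ab == tau2_cd

origin = (0, 0, 0, 0)
# A clock at rest for 4 units and a clock moving at 3/5 for 5 chart units
# both record 4 units of proper time.
print(C_T(origin, (4, 0, 0, 0), origin, (5, 3, 0, 0)))
```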

In the fifth place we have an eight-place relation C_S(a, a', b, b', c, c', d, d') of
spatial congruence. Just as with spatial betweenness, this departure from the 
Hilbert pattern is occasioned by the fact that observations using a rod are not 
instantaneous, but rather require a finite period of time during which the rod 
and the objects measured are (approximately) in uniform inertial motion 
together. Therefore the information yielded by the successive application of 
a graduated rod to two different spatial intervals ab and cd is actually 
information concerning a pair of parallel timelike segments aa’ and bb’ 
(defining the motion of the objects forming the first spatial interval during 
the first application of the rod), and their spatial congruence to a pair of 
parallel timelike segments cc’ and dd’ (defining the motion of the objects 
forming the second spatial interval during the second application of the 
rod).1 Of course the second pair need not be parallel to the first pair, since 
the rod may be brought into a different state of inertial motion for the second 
application. Also the two pairs need not be temporally congruent, since the 


1 In order that the relation B_S(a, a', b, b', c, c') should be precisely defined mathematically and
physically it is necessary to make some conventional decisions as to how the end-points of the
segments a, a' etc. are to be selected. E.g. if they are to be unique, they might be taken as the
end-points of the connected open or closed interval during which that part of the rod is co- 
moving with the object being measured. Alternatively, the relation might be defined to hold 
for any two distinct points a, a’ lying within that interval, or for any two points a, a’ on the line 
of motion. These details do not affect the development, and I will not bother about them here. 
Similar remarks apply to the congruence relation C_S defined below.


rod need not remain in contact with the measured segments for equal time
intervals in the two applications. 

These two congruence relations represent the two physically quite 
different processes employed to arrive at quantitative measures of spatio- 
temporal distance. As with the betweenness relations, the representation of 
these processes by means of four and eight place congruence relations 
respectively involves a certain amount of conventional choice, but does seem 
to provide a fair mathematical representation of what is determined by the 
actual processes of measurement. The most common methods of measurement
of either space or time require the construction of a graduated scale
involving recurrence of the same (i.e. congruent) spatial or temporal 
intervals, and the measurement of other intervals by determining them to be 
congruent to a particular interval along the scale. Thus in both cases it seems 
clear that the information derived from ordinary measurement processes 
may be fully represented simply as information concerning the occurrence 
or non-occurrence of these congruence relations among particular n-tuples 
of points. 

All five of the above concepts are employed in the standard process of 
construction of a coordinate system as described e.g. in Einstein [1905], in 
the sense that determinations must be made of the applicability or
non-applicability of each of these five relations to particular n-tuples of
space-time points. This is explicitly stated for the relations B_L, C_T, and C_S, since
we are instructed to send light rays here and there, and to measure with 
clocks and rods the lengths of various spatial and temporal intervals. The 
relations B_S and B_T seem to be appealed to implicitly, since the ordinary
processes of spatial and temporal measurement presuppose that the spatial 
lengths are laid out along a straight spatial line, and that the clocks and rods 
are at least approximately in free inertial motion. The present axiom system 
will demonstrate that no other physical concepts are involved, by deriving 
the standard space-time theory from laws expressed in terms of these five 
primitives. 

I am asserting that these five types of physical fact or process constitute 
the physical subject-matter and form the empirical basis of the space-time 
theory of special relativity. By this I do not mean to assert that they are 
‘purely observational’ terms in any mythical sense, but simply that they 
refer to definite types of physical fact which physicists know how to identify 
when they occur, at least in favourable cases, with a reasonable degree of 
accuracy and reliability.1 Furthermore there seems to be no doubt that the
ordinary procedures of spatio-temporal observation and measurement do 
determine instances of these five relations, in a reliable manner. Thus the 


1 ‘Accuracy’ here refers not only to possible errors of measurement, but also to the possibility 
that the specified relation does not exactly hold. Thus, we can reliably apply the notion of 
temporal betweenness to motions which are free of forces only to within a certain degree of 
approximation. 


The Physical Content of Minkowski Geometry 35 


identification of these relations as the physical subject-matter of the space- 
time theory has a strong prima facie plausibility. 

It should also be noted that the observation of these physical relations is 
not dependent upon the full development of the special relativistic space- 
time theory. The same physical relations occur as part of the subject-matter 
of pre-relativistic space-time theories, and the same experimental pro- 
cedures are prescribed for observation of them. I am particularly concerned 
here to compare Minkowski geometry with a classical space-time geometry 
which I shall call (following Mundy [1982]) Maxwell geometry. This consists 
of the classical space-time structure (often called neo-Newtonian space-time 
in the literature), together with the distinguished family of lines along which 
light rays are thought to propagate according to classical electrodynamics. 
These lines define at each point an affine cone with respect to the four- 
dimensional affine structure which underlies the classical space-time 
structure. In fact this structure of four-dimensional affine geometry plus 
light cone is exactly the same in Maxwell geometry and in Minkowski 
geometry. This structure is definable in terms of the concepts B_T and B_L
alone, and I shall refer to it as the inertial-optical structure. (The two
geometries also agree in respect of the straight-edge betweenness relation
B_S.) The two geometries differ only in their congruence structure, i.e. in the
ways in which the relations C_S and C_T are superimposed over the common
inertial-optical structure. The differences will be made explicit below, and it 
will be shown that the constancy of the speed of light in Minkowski 
geometry is the essential difference. 


4 DEFINITIONS AND AXIOMS 


The relation B(a,b,c) of general linear betweenness is defined as the 
disjunction of B_T and B_L. It is known from the standard synthetic
treatments of geometry in the Hilbert tradition that it is possible to develop 
affine geometry in terms of the three-place relation of betweenness on a line 
as the sole primitive. (For the convenience of the reader a brief sketch of the 
essential concepts of affine geometry and a complete set of axioms for four- 
dimensional affine geometry in terms of the relation B(a, b, c) are given in an 
Appendix.) In such a geometry it is assumed that any two points may be 
connected by a line. In physical space-time geometry, however, we do not 
know if this is the case. Therefore it is more appropriate to consider certain 
substructures of four-dimensional affine geometry, which may contain only 
some of the lines of full affine geometry. 

I will define a subaffine geometry of dimension n to consist of a set A and a
three-place relation B(x, y, z) defined over the elements of A, which is
isomorphic to the result of restricting the betweenness relation of standard 


1 Dorling’s second axiom system and those of the author mentioned in note 1, p. 31 are of this 
type. Another example is the axiom system of Mundy [a], in which only optical lines are 
present. 


36 Brent Mundy 


n-dimensional affine geometry A^n to those triples x, y, z of which all three
points lie on one of some designated subset S of the lines of A^n. (Thus each
line of the subaffine geometry has all of the points which it ought to have;
only some of the lines of the full geometry are missing.) I will define a convex
subaffine geometry to be one for which the designated subset S of A^n is
defined as follows: there is an n-dimensional convex subset C of A^n and a
fixed point p, and the set S consists precisely of the lines of A^n which are
parallel to the line pc for some point c in C. (Thus the rays from p through C
form a convex cone with vertex at p, in the standard sense.) This type of 
subaffine geometry is an appropriate representation of the geometric 
structure of the B_T and B_L system, since we have empirical evidence that for
any two possible states of inertial or optical motion, all of the states of 
intermediate velocity in the same plane are possible. This is the essential 
defining property of a convex subaffine geometry. In the Appendix I 
describe how the notion of a convex subaffine geometry may be axiomatised 
in terms of the primitive relation B(a, b, c) of betweenness, in such a way as 
to allow all of the concepts of full affine geometry to be applied in an 
unambiguous way to the subaffine geometry. 
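The principle of intermediate velocities can be illustrated numerically in the Minkowski model. The sketch below assumes an inertial chart with c = 1 and parametrises each inertial or optical direction by its velocity (vx, vy, vz), so that the admissible directions form the closed unit ball, which plays the role of the convex set C. It samples intermediate velocities rather than proving convexity, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def admissible(v):
    """A velocity (vx, vy, vz), in units of c, belongs to a T or L line
    of the assumed Minkowski model iff its speed does not exceed 1."""
    return sum(Fraction(x) ** 2 for x in v) <= 1

def intermediate(u, w, s):
    """Velocity a fraction s of the way from u to w."""
    s = Fraction(s)
    return tuple((1 - s) * Fraction(a) + s * Fraction(b) for a, b in zip(u, w))

def check_convexity(velocities, steps=10):
    """True if every sampled velocity between two admissible ones is admissible."""
    pairs = [(u, w) for u, w in product(velocities, repeat=2)
             if admissible(u) and admissible(w)]
    return all(admissible(intermediate(u, w, Fraction(k, steps)))
               for u, w in pairs for k in range(steps + 1))

sample = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
          (Fraction(3, 5), Fraction(4, 5), 0), (0, 0, 0)]
print(check_convexity(sample))
```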


Axiom 1 (subaffine geometry): The relation B(a, b, c) defines a
four-dimensional convex subaffine geometry over the set of points.


Various concepts definable in affine geometry such as line, plane, etc. will 
be used in the axioms to follow. They are understood to refer to the subaffine 
structure given by Axiom 1. They refer to those sets of points which
correspond to lines, planes, etc. under the natural embeddings of the 
subaffine structure into a full affine structure. 

The physical content of Axiom 1 may be divided into three assertions. 
The first is a non-metrical consequence of Newton’s first law of motion. The 
first law states that the paths of freely moving mass-points (i.e. triples of
points bearing the relation B_T(a, b, c)) are such that the spatial intervals
traversed are proportional to the times elapsed. This implies that these paths 
are straight lines in a four-dimensional affine geometry defined by the spatial 
and temporal coordinates. (This is easily seen to be true in classical 
mechanics.) Thus these lines form a subaffine geometry. The second 
assertion is that the lines defined physically by light rays are straight lines in 
the same affine geometry as that defined by the inertial lines. This follows 
from the law that light as well as force-free matter moves at a constant speed 
in a straight line. The third assertion is the ‘principle of intermediate 
velocities’ stated above, which is equivalent to the convexity of the subaffine 
geometry. Axiom 1 is true in Maxwell geometry, where the relation B holds 
among the triples of all lines except those lying in the hyperplanes of 
absolute simultaneity. These lines form a convex subaffine geometry. (One 
might also state as an affine axiom at this point the fact that the straight-edge 
betweenness relation B_S agrees with the inertial-optical affine structure; this
is included in Axiom 4 below.) 

Axiom 2 (affine homogeneity): If B_T(a, b, c) (respectively B_L(a, b, c)), and
B(p, q, r), and the line a, b, c is parallel to the line p, q, r, then B_T(p, q, r)
(respectively B_L(p, q, r)).


This axiom asserts that if one of the two linear processes is possible at one
place or time along a certain affine direction, then it is possible at any other 
place or time along that same direction. By Axiom 1 we know that at least one 
of the two basic processes is possible for any given triple on each line; Axiom 
2 ensures that the same process will be possible all along the line. A line will 
be said to be of type T (respectively type L) if the processes B_T (respectively
B_L) are possible for the triples lying on it. A segment is assigned the same
types as the unique line upon which it lies. Note that nothing prevents a line 
from being of both types; in Maxwell geometry the L lines are also T lines. 
The physical basis for the axiom is the observed homogeneity of space and 
time. The axiom is obviously true in Maxwell geometry. 

Parts (a)-(d) of the following axiom express the basic formal properties of
the two congruence relations, as explained informally above. These have an 
obvious and well-confirmed empirical content in terms of rods and clocks, 
and are obviously true in Maxwell geometry. Part (d) ensures that the 
relation C_S is actually a relation between lines, not merely between points on
them. We may define C_S(A, B, C, D) to hold among the four T lines A, B, C,
D just in case there are segments aa', bb', cc', and dd' on the four lines
respectively such that C_S(a, a', b, b', c, c', d, d') holds.

Part (e) is more subtle. Affine geometry allows for the determination of a 
certain weak congruence relation among segments on parallel lines, based on 
the principle that opposite sides of a parallelogram are congruent. Since the 
affine structure is defined by the relations B_T and B_L while the metrical
structure is defined by the physically distinct relations C_S and C_T, it is
obviously a physical hypothesis that the affine and the metrical congruence 
relations agree with one another where they are both defined, i.e. on parallel 
segments. For C_S this follows from Axiom 4, while for C_T it is stated in Part
(e) below. This physical hypothesis is very difficult to test empirically on 
Earth, owing to the difficulty of physically realising the affine configurations 
of inertial or optical paths which are required for the physical determination 
of relations of affine congruence. It is usually assumed without discussion 
that the affine and metrical congruence relations will agree. 


Axiom 3 (congruence): 

(a) C_T is an equivalence relation over segments of type T.

(b) C_S is an equivalence relation over unordered pairs of parallel segments of
type T.

(c) For any segment ab of type T and ray A of type T with end-point c, there
is a unique point d on A with C_T(a, b, c, d).

(d) If C_S(a, a', b, b', c, c', d, d') and xx' is a segment on line a, a' and z is a
point on line c, c' then there are segments yy' on line b, b', zz' on line c, c' and ww'
on line d, d' such that C_S(x, x', y, y', z, z', w, w').

(e) If segments ab and cd lie on parallel lines of type T then C_T(a, b, c, d) if
and only if ab and cd are affine congruent.


Within affine geometry it is easily seen that the set of all lines parallel to 
some given line constitutes an affine geometry of one lower dimension, in 
which the parallel lines play the role of points, the unique planes through 
any two of them play the role of lines, etc. The same is easily seen to hold for 
subaffine geometries. For any given line K of type T, I will refer to the three- 
dimensional subaffine geometry of lines parallel to K as G(K). In fact G(K) 
will be a full three-dimensional affine geometry, since there exists a plane 
through any two parallel lines in a convex subaffine geometry. It is known 
from standard synthetic geometry that three-dimensional Euclidean 
geometry can be axiomatised in terms of affine betweenness and con- 
gruence, in the Hilbert tradition. (Explicit axioms are given in the 
Appendix.) The following axiom expresses the well-confirmed empirical 
fact that the possible arrays of rigid bodies relatively at rest are determined 
by the laws of Euclidean geometry, and also that the affine betweenness 
in this geometry determined by straight-edges (i.e. the relation 
B_S(a, a', b, b', c, c')) agrees with the affine betweenness derived from the
subaffine geometry of inertial and optical lines. The latter claim is hard to 
test directly, as are all statements about the inertial-optical subaffine 
geometry, but it has some indirect support e.g. from the observed fact that 
inertial motion proceeds along straight lines in the metrical geometry 
defined from rigid bodies. The axiom is obviously true in Maxwell 
geometry. Note that the Euclidean geometry of G(K) induces a Euclidean 
geometry on the points of any affine hyperplane not parallel to K, since each 
such point belongs to a unique line parallel to K. 


Axiom 4 (Euclidean geometry): For any line K of type T, the relation 
C_S(A, B, C, D) defines a three-dimensional Euclidean geometry within the
three-dimensional affine space G(K), and for any segments aa', bb' and cc'
parallel to K we have B_S(a, a', b, b', c, c') iff the line bb' is between the lines aa'
and cc’ in G(K). 


We now require two axioms placing some constraints upon the process of 
light propagation. For any point p, let L(p) be the set of all points q such that 
the line pq is of type L. 


Axiom 5 (Light cone): There is a T segment pq with midpoint o and a
subaffine hyperplane H through the point o such that the intersection S of L(p)
and L(q) is the surface of a Euclidean sphere in the Euclidean geometry induced
on H by G(K) and C_S(A, B, C, D).


The physical content of this axiom has two parts. By affine homogeneity, 
we know that any L line is parallel to some L line through p, so that the 
constraints here placed upon the L lines through p and q will apply to all L
lines of the space. The first part is that each plane through the line pq
contains only two L lines through each point, i.e. that light propagates along 
a fixed space-time path in each spatial direction, independently of any other 
facts such as the velocity of the source. (This is true in classical electro- 
dynamics but was denied, e.g. in the optical theory of Ritz.) The second part 
is the constraint placed on the subaffine shape of the locus L(p) by the 
spherical shape of the set S, which ensures that L(r) for any point r will be a 
quadric cone in the subaffine space-time geometry. 

It is appropriate here to introduce Einstein’s definition of simultaneity in 
our formal context. Let o be the affine mid-point of the T segment ab, and let 
c be a point not on the line ab. The point c is said to be simultaneous with the 
event o, with respect to the line ab, if the two segments ac and bc are of type 
L. In other words, o is simultaneous with c if equal clock time elapses along 
the line ab between a and o and between o and b, while a light signal makes a 
round trip from a to c and back from c to b. Note that this empirical measure 
of simultaneity is appropriate in Maxwell geometry as well as in Minkowski 
geometry, on condition that the T segment ab be at rest with respect to the 
ether frame, so that the speed of light will be the same on the outward and the 
return journey. From this viewpoint we see that the spherical shape of S is 
simply an expression of the constancy of the speed of light in all directions, 
as measured in the ether frame. Thus Axiom 5 is true in Maxwell geometry. 

The last axiom expresses the constancy of the speed of light with respect 
to all inertial observers. Three points o, a, b are said to form a light triangle if 
oa is timelike, b is simultaneous to o with respect to the line oa, and ab is 
lightlike. A light triangle is the physical structure one erects in order to 
measure the one-way speed of light, after having adopted Einstein’s defi- 
nition of simultaneity. 
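Einstein's definition can be tried out in coordinates. The sketch below assumes the Minkowski model in an inertial chart (t, x, y, z) with c = 1; `einstein_simultaneous` is a hypothetical name for the test that c is simultaneous with the affine midpoint of the T segment ab, i.e. that ac and cb are both lightlike.

```python
from fractions import Fraction

def interval(a, b):
    """Squared interval dt^2 - dx^2 - dy^2 - dz^2 (assumed chart, c = 1)."""
    dt, dx, dy, dz = (Fraction(q) - Fraction(p) for p, q in zip(a, b))
    return dt * dt - dx * dx - dy * dy - dz * dz

def midpoint(a, b):
    """Affine midpoint o of the segment ab."""
    return tuple((Fraction(p) + Fraction(q)) / 2 for p, q in zip(a, b))

def einstein_simultaneous(a, b, c):
    """True if c is simultaneous with the midpoint of the T segment ab,
    with respect to the line ab: a signal from a reaches c and returns to b."""
    if interval(a, b) <= 0:
        raise ValueError("ab must be a segment of type T")
    return interval(a, c) == 0 and interval(c, b) == 0

# Line at rest in the chart: the simultaneous event shares the chart time of o.
print(einstein_simultaneous((-1, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0)))
# Line moving at 3/5: c is simultaneous with o = (5, 3, 0, 0) for that line,
# although its chart time is 8 rather than 5.
a, b, c = (0, 0, 0, 0), (10, 6, 0, 0), (8, 8, 0, 0)
print(midpoint(a, b), einstein_simultaneous(a, b, c))
```

The second case shows that the relation is defined relative to the chosen T line: events simultaneous with o for the moving line need not share the chart time of o.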


Axiom 6 (constancy of the speed of light): Let two light triangles o, a, b and 
o', a’, b' be given, and let bc be parallel to oa and b'c' be parallel to o'a'. Then 
C_T(o, a, o', a') if and only if C_S(o, a, b, c, o', a', b', c').


The spatial distance covered by the light ray during the time interval oa is 
measured by a rod connecting the T segments oa and bc. Thus the axiom 
guarantees that light will always traverse equal spatial distances in equal 
clock times, and hence be travelling with the same speed in both cases. Thus 
Axiom 6 implies that Axiom 5 is true for all T directions. 

Axiom 6 is the only axiom not true in Maxwell geometry. In Maxwell 
geometry Axiom 5 is true only for a distinguished parallel family of T lines 
(physically interpreted as those at rest in the ether frame). T lines of any 
other direction will violate Axiom 5, i.e. observers moving along those lines
will observe the speed of light to be different in different directions. (This is 
the geometrical basis for the expected outcome of the Michelson-Morley 
experiment, which has a very simple geometrical representation in Maxwell 
geometry.) This expectation is a consequence of the laws of Maxwell 
geometry relating the congruence relations C_S and C_T to the underlying
inertial-optical structure, which are different from the law expressed in
Axiom 6. 
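The contrast between the two geometries with respect to Axiom 6 can be checked in a small worked example. The sketch below is restricted to one spatial dimension along the direction of motion and rests on modelling assumptions that go beyond the text: an ether-frame chart (t, x) with c = 1; in Maxwell geometry clocks read absolute time and rods read the separation of parallel lines at equal absolute time; in Minkowski geometry clocks read proper time and rods read the separation in the rest frame of the lines. It compares the squared ratio of rod distance to clock time for light triangles erected on lines of different velocity v.

```python
from fractions import Fraction

def light_triangle(v, tau=1):
    """Light triangle o, a, b on the T line of velocity v (chart (t, x), c = 1).

    a lies on the line at chart time tau; b is the event reached by a signal
    sent from the mirror image p = -a of a through o and received back at a.
    """
    v, tau = Fraction(v), Fraction(tau)
    o = (Fraction(0), Fraction(0))
    a = (tau, v * tau)
    b = (v * tau, tau)  # p->b and b->a are both lightlike
    return o, a, b

def speed_sq_maxwell(v, tau=1):
    """(rod distance / clock time)^2 under the assumed Maxwell measures."""
    o, a, b = light_triangle(v, tau)
    elapsed = a[0] - o[0]             # absolute time along oa
    gap = b[1] - Fraction(v) * b[0]   # separation at the absolute time of b
    return (gap / elapsed) ** 2

def speed_sq_minkowski(v, tau=1):
    """(rod distance / clock time)^2 under the assumed Minkowski measures."""
    o, a, b = light_triangle(v, tau)
    v = Fraction(v)
    proper_time_sq = (a[0] - o[0]) ** 2 - (a[1] - o[1]) ** 2
    gap = b[1] - v * b[0]                  # chart separation of the two lines
    distance_sq = gap ** 2 / (1 - v * v)   # rest-frame separation squared
    return distance_sq / proper_time_sq

for v in (0, Fraction(1, 2), Fraction(3, 5)):
    print(v, speed_sq_maxwell(v), speed_sq_minkowski(v))
```

Under these assumptions the Minkowski ratio is the same for every v, as Axiom 6 requires, while the Maxwell ratio equals (1 - v^2)^2 and so agrees with the ether-frame value only for v = 0.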


5 CONSTRUCTION OF COORDINATE SYSTEMS 


Let the above axioms now be accepted as true. An observer in inertial 
motion follows a line K of type T. Let this observer construct a coordinate 
system following Einstein’s prescriptions, taking a fixed point o on K as 
origin. By Axiom 4 the set of all lines parallel to K possesses a Euclidean 
geometry G(K) with respect to the relations B_S(A, B, C) and C_S(A, B, C, D).
Thus the standard procedures of spatial measurement using rigid rods will 
yield Euclidean coordinates for G(K). To complete the construction we 
need a time coordinate t to distinguish among the events on each line. This is
the function of Einstein’s definition of simultaneity. 

Now let p and q be two points on K such that o is the midpoint of the 
segment pq. It follows from Axioms 5 and 2 that the sets L(p) and L(q) are
parallel quadric cones, and hence their locus of intersection S will lie in a 
hyperplane H by affine geometry. (One may also appeal to the fact that this is 
true in Minkowski geometry, and that the optical and affine structure of this 
space is already fixed to be the same as that of Minkowski geometry by 
Axioms 1-5.) By the definition of simultaneity, all of the points of S are 
simultaneous with o with respect to the line K. Thus pos is a light-triangle 
for each point s of S. Hence by Axioms 6 and 3(b) we have C_S(K, A, K, B),
where A and B are any two lines through points of S parallel to K. Thus S is 
a Euclidean sphere in G(K). Now select a Euclidean coordinate system 
x,y,z for G(K) for which this sphere has radius c, for some arbitrarily 
chosen constant c. By affine geometry, there is a unique affine coordinate 
system which assigns affine coordinates x, y, z, t to each point in such a way 
that each point has the coordinates x, y, z belonging to the line parallel to K 
in G(K) on which it lies, all points on the line K have x = y = z = 0, t(o) = 0
and t(q) = 1. (I.e. the line K is scaled with o as origin and oq as unit, and t
coordinates are defined by hyperplanes parallel to H.) 

It is obvious that the L segments ps for s in S will satisfy the equation 


c^2 dt^2(ps) = dx^2(ps) + dy^2(ps) + dz^2(ps). (1)


By Axiom 2 every L line in the space is parallel to one of the segments ps. By 
affine geometry parallel lines have proportional coordinate differences, and 
hence all L lines of the space satisfy this equation. 

Now let K' be any other line of type T, and let x', y', z', t' be some system of
coordinates constructed in the same manner with respect to K’. The L lines 
will also satisfy the equation (1) in the primed coordinates. Since the primed 
and unprimed coordinates are both affine coordinates within the same 
subaffine geometry, they are related to one another by an inhomogeneous 
linear transformation. It is well known from the ordinary coordinate-based 
formulation (e.g. Friedman [1983], p. 140) of the theory that any inhomogeneous
linear transformation which preserves the equation (1) is a Lorentz
transformation (plus a Euclidean rotation, affine translation and dilation). 
Thus the above axioms are sufficient to yield the basic conclusion of the 
standard theory, i.e. that coordinate systems constructed in the standard 
manner will be related by a Lorentz transformation. 
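One direction of the cited fact is easy to check by direct computation: a Lorentz boost carries solutions of equation (1) to solutions, whereas a Galilean boost does not. The sketch below assumes c = 1 and a boost along the x axis, and takes the factor gamma = 1/sqrt(1 - v^2) as an argument so that the arithmetic stays exact. It does not establish the converse claim that every linear transformation preserving (1) is of Lorentz type.

```python
from fractions import Fraction

def light_form(d):
    """dt^2 - dx^2 - dy^2 - dz^2 for a coordinate difference d (c = 1);
    equation (1) holds exactly when this vanishes."""
    dt, dx, dy, dz = d
    return dt * dt - dx * dx - dy * dy - dz * dz

def lorentz_boost(d, v, gamma):
    """Boost along x with velocity v; gamma must equal 1/sqrt(1 - v^2)."""
    dt, dx, dy, dz = d
    return (gamma * (dt - v * dx), gamma * (dx - v * dt), dy, dz)

def galilean_boost(d, v):
    """Galilean boost along x with velocity v: time differences unchanged."""
    dt, dx, dy, dz = d
    return (dt, dx - v * dt, dy, dz)

v, gamma = Fraction(3, 5), Fraction(5, 4)
light_rays = [(1, 1, 0, 0), (1, -1, 0, 0), (5, 3, 4, 0), (1, 0, 0, 1)]
for d in light_rays:
    print(light_form(lorentz_boost(d, v, gamma)), light_form(galilean_boost(d, v)))
```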

Now let only Axioms 1-5 be accepted. Let T segments pq (and the lines on
which they lie) and hyperplanes H satisfying Axiom 5 be called rest lines and 
hyperplanes of simultaneity respectively. It follows from Axioms 1 and 2 that 
any line (hyperplane) parallel to a rest line (hyperplane of simultaneity) is 
one also. In place of Axiom 6 we take 


Axiom 6' (absolute time): Let H and H' be two hyperplanes of simultaneity,
and let pp' and qq' be T segments, with p and q in H and p' and q' in H'. Then
C_T(p, p', q, q').


Axiom 6' says that hyperplanes of simultaneity cut any two T lines in 
congruent segments. It is easy to show from Axioms 6’, 1 and 3(e) that then 
all hyperplanes of simultaneity must be parallel. Thus the physical content 
of Axiom 6’ is that all clocks will measure the same time interval between any 
two hyperplanes of simultaneity, regardless of their state of inertial motion. 
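The content of Axiom 6' can be contrasted with the Minkowski case in a few lines. The sketch below assumes a chart with c = 1 whose hyperplanes t = 0 and t = dt are hyperplanes of simultaneity, and compares the squared clock reading between them for clocks of different velocity v. The two clock models are assumptions of the illustration: absolute time for Maxwell geometry, proper time for Minkowski geometry.

```python
from fractions import Fraction

def clock_time_sq_maxwell(v, dt=1):
    """Squared clock reading between t = 0 and t = dt when clocks show
    absolute time (assumed Maxwell model); independent of v."""
    return Fraction(dt) ** 2

def clock_time_sq_minkowski(v, dt=1):
    """Squared proper time between the same two hyperplanes for a clock
    of velocity v (assumed Minkowski model, c = 1)."""
    v = Fraction(v)
    return Fraction(dt) ** 2 * (1 - v * v)

def satisfies_absolute_time(clock_time_sq, velocities, dt=1):
    """True if all sampled clocks agree on the elapsed time."""
    return len({clock_time_sq(v, dt) for v in velocities}) == 1

velocities = [0, Fraction(1, 2), Fraction(3, 5)]
print(satisfies_absolute_time(clock_time_sq_maxwell, velocities))
print(satisfies_absolute_time(clock_time_sq_minkowski, velocities))
```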

For construction of coordinates in Maxwell geometry, we again take an 
origin o on a T line K. Take the unique hyperplane H of simultaneity 
through o, and take Euclidean coordinates x,y,z on H for the geometry 
G(K), with origin o. Let p be any point on K other than o. There is a unique 
affine coordinate system x, y, z, t which agrees with x, y, z on H, has x,y,z 
constant along all lines parallel to K, sets t(o) = 0, t(p) = 1, and has t
constant on all hyperplanes of simultaneity. It is easy to see that any two 
such coordinate systems are related by a Galilean transformation (plus a 
Euclidean rotation, an affine translation and dilation, and a scale transfor- 
mation of the time coordinate). 

This shows that the essential physical difference between Maxwell 
geometry and Minkowski geometry, expressed in the transition from the 
Galilean to the Lorentz transformations for coordinate systems constructed 
with rods and clocks, is fully represented axiomatically by the substitution 
of Einstein’s law of the constancy of the speed of light (Axiom 6) for the 
classical law of absolute time (Axiom 6’). This seems to establish several 
major points. First, it supports my original claim that the physical content of 
the special relativistic space-time theory can be expressed in an objective 
way as a set of physical propositions about the types of physical fact which 
are involved in the construction of a coordinate system in the standard 
manner specified by Einstein. Second, it supports my claim that the physical 
concept of causation does not occur as part of the subject-matter of the 
space-time theory, except in so far as that concept is contained in the five 
concepts used here: propagation of light rays, free inertial motion, use of 
clocks, application of straight-edges, and transport of graduated rods. 

Third, it shows the crucial importance for the space-time theory of 
Einstein’s law of the constancy of the speed of light, since that is the only new
assumption which need be added to Axioms 1-5 (true already in Maxwell 
geometry), to arrive at the essential proposition of the new theory. 


6 NON-STANDARD KINEMATIC MODELS 


Some further important conclusions may be drawn by taking into account 
what does not follow from these assumptions. Let 𝒯 be the theory consisting
of Axioms 1-6, and 𝒯' be the theory consisting of Axioms 1-5 and 6'. These
two theories are stated in the same formal language, containing the five
primitive terms B_L, B_T, B_S, C_T and C_S and given the physical interpretation
outlined earlier.1 I shall call this the kinematic language, and models for it
kinematic models. I have tried to demonstrate that theories 𝒯 and 𝒯' contain
the essential physical assumptions which underlie the two informal space- 
time theories we are considering. In particular, I have shown that the 
coordinate transformation laws which are usually thought to give the 
essential physical content of those informal theories follow from the formalised
theories 𝒯 and 𝒯' respectively.

However, certain features often associated with space-time theories are 
not present in the theory 𝒯, or in its classical counterpart 𝒯'. The first of
these is the special principle of relativity, i.e. the principle that all physical 
laws are the same for an observer or system in any state of uniform inertial 
motion. This principle is obviously not a consequence of either space-time 
theory, since the kinematic language does not allow one to make statements 
about all physical laws. Certain instances of the principle are indeed 
contained in each theory. Axiom 4 ensures that the Euclidean laws 
governing structures of rigid bodies will hold equally for all inertial frames. 
Axioms 2 and 5 ensure that the affine laws of light propagation (e.g. that light 
travels along a rigid-body straight-edge) will hold equally for all inertial 
frames. Axiom 6 ensures that in 𝒯 we will also have the metrical laws of light
propagation hold equally in all inertial frames. (Of course this fails in 𝒯'.)
However, neither theory contains a full-strength relativity principle assert- 
ing equivalence of all inertial frames for all physical processes, nor even a 
principle asserting such equivalence for all processes describable in the 
kinematic language. 

Of course the theory J is in this respect ahistorical, since Einstein’s own 
presentation of the relativistic kinematics relied heavily upon the relativity 
principle.² One of the purposes of the present analysis is to avoid the 
powerful but abstract relativity principle in favour of more concrete and 
explicit laws with a direct physical content as propositions about a 
specifiable class of observable physical facts. 

1 I am here assuming that some conventional resolution of the freedom of definition of the 
predicates Cs and Bs discussed in note N.1, p. 33 has been made, and additional axioms 
formulated so as to eliminate that source of variation among models of T and T”. 


2 The axiomatisations of Schutz [1973], [1981] involve a formalisation of the relativity 
principle, asserting the existence of certain automorphisms of the geometrical space. 



One interesting consequence of this first point is that the close connection 
between the group of coordinate transformations among inertial coordinate 
systems and the symmetry groups of the kinematic models or their 
substructures, which plays an important role in many recent analyses of 
space-time theories (cf. Friedman [1983]), is here severed. Any model of J 
or J” will have the corresponding group of coordinate transformations, but 
need not have that same group as the automorphism or symmetry group of 
the model itself or a distinguished substructure. (For Maxwell geometry we 
distinguish between the Galilean symmetry group of the affine and metric 
structure alone, and the true symmetry group which also respects the 
absolute rest defined by the ether frame.) 

The second absent feature is the categoricity of standard 
axiomatisations, i.e. the fact that they determine a single model up to 
isomorphism. In fact the theories J and J” have many different models, 
and therefore the physical content of these standard space-time theories is 
consistent with a variety of different assumptions about other aspects of 
kinematic structure than the relations among inertial coordinate systems. 

The structural element of the kinematic models which is left open by the 
theories J and J” is: just which lines of the subaffine geometry shall be of 
type T. (This is highly significant physically because the distribution of T 
lines determines what inertial motions are possible according to the given 
kinematic structure.) Axiom 2 ensures that parallel lines are of the same 
type, and hence the question becomes, which parallel families are of which 
types. Axiom 5 ensures that there is a definite light cone, and then the 
convexity of the subaffine geometry (Axiom 1) ensures that all lines in the 
interior of the light cone (i.e. the affine region which contains at least one T 
segment by Axiom 5) are of type T. By the uniqueness of the light cone, 
subaffine lines outside the cone cannot be of type L. Therefore the only line- 
type information left open by the theories J and J” is which if any of the 
lines on or outside of the light cone are of type T, i.e. are possible paths of 
inertial motion. In physical terms this amounts to asking whether inertial 
motion can proceed at a speed equal to or greater than that of light. 
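
The line-type question can be made concrete with a small sketch. It assumes the standard coordinate representation with units in which the speed of light is 1; the function names are hypothetical and serve only for illustration. Directions are sorted by their position relative to the light cone, and two candidate assignments of type T are compared: the traditional relativistic one, and a permissive non-standard one. Neither assignment is singled out by the axioms themselves.

```python
from fractions import Fraction

def interval(v):
    """Quadratic form t^2 - x^2 - y^2 - z^2 of a direction v = (t, x, y, z)."""
    t, x, y, z = (Fraction(c) for c in v)
    return t * t - x * x - y * y - z * z

def cone_position(v):
    """Where a nonzero direction lies relative to the light cone (c = 1)."""
    q = interval(v)
    if q > 0:
        return "inside"   # slower than light
    if q == 0:
        return "on"       # at the speed of light
    return "outside"      # faster than light

def type_T_standard(v):
    """Traditional relativistic choice: only interior directions are inertial."""
    return cone_position(v) == "inside"

def type_T_permissive(v):
    """Hypothetical non-standard choice: every nonzero direction is inertial."""
    return any(Fraction(c) != 0 for c in v)

for v in [(1, 0, 0, 0), (1, 1, 0, 0), (1, 2, 0, 0)]:
    print(v, cone_position(v), type_T_standard(v), type_T_permissive(v))
```

The two predicates agree inside the cone and differ only on or outside it, which is exactly the region the paragraph above describes as left open.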

My contention is that nothing in either the classical or the special 
relativistic space-time theories provides any answer to this question. The 
evidence for this is that the theories J and J” seem to formalise adequately 
the physical content of those space-time theories, and yet do not fix an 
answer to this question. There are answers which have been traditionally 
given to this question in both cases: The classical space-time theory has been 
traditionally represented by the particular kinematic model in which just the 
lines not lying in one of the hyperplanes of simultaneity are of type T. This 
amounts to asserting that inertial motion is possible at all finite velocities, 
but not at infinite velocity. The relativistic theory has been traditionally 
represented by the particular kinematic model in which just the lines lying 
inside the light cone are of type T, i.e. inertial motion is possible only at 
speeds less than that of light. (I will call these the standard kinematic 
models.) Certain reasons may be given for these choices, but I maintain that 
those reasons do not derive from the space-time theories J and J”. 

One possible reason is that the standard models satisfy various relativity 
and symmetry principles. There are models of the theories J and J” in 
which, for example, inertial motion at or exceeding the speed of light is 
possible along some spatial directions but not others. So long as the whole 
set of forward T lines through any point forms a convex cone we still have a 
model of the theory, but Euclidean rotational invariance is violated. The 
standard kinematic models satisfy these natural invariance principles. 

However, these symmetry principles still do not suffice to single out the 
standard models. For J” the symmetry principles allow a model in which 
the lines lying in the hyperplanes of simultaneity are also of type T, i.e. 
inertial motion at infinite speed is possible. For J we have models in which 
inertial motion along all null lines or along all null and spacelike lines is 
possible. These models may be argued against on dynamical grounds: the 
standard dynamical theories of the processes thought to determine the states 
of inertial motion in classical and relativistic space-time respectively will not 
allow an object to be accelerated from a state of motion of the allowed type 
into one of the forbidden types by a force of finite magnitude acting for a 
finite time. But these limitations on accelerative processes are not part of the 
space-time theories, and in any case they do not prevent the existence of 
particles travelling along the forbidden trajectories, so long as these states of 
motion arise through some other process (e.g. particle decay) than gradual 
acceleration by a finite force. 

The third type of argument which might be offered in favour of the 
standard models would be that inertial motion is a form of causation, so that 
if inertial motion faster than light is possible then there will be closed causal 
cycles and consequent causal paradoxes (e.g. Bohm [1965], p. 158). In reply 
we may note that the kinematic theory alone does not in any way guarantee 
that for any physically possible inertial motion it will be possible to contrive 
a causal process which shall reliably traverse the path corresponding to that 
motion, in either direction. This depends completely upon the dynamical 
laws (and their causal character) which govern whatever the processes may 
be whereby particles moving with supra-luminal velocities arise. Therefore 
once again the supposed impossibility depends upon assumptions about the 
dynamical laws of the world, not merely upon the kinematics. 

I conclude that neither the relativity or symmetry principles nor the 
existence of a privileged standard kinematic model is part of the actual 
physical content of the standard space-time theories. In particular I claim 
that nothing in the standard kinematic theory (formalised here as J) 
prevents the occurrence of inertial motions at or exceeding the speed of 
light. 

This completes my positive analysis of the physical content of the special- 
relativistic space-time theory. The remainder of the paper is devoted to 
direct criticisms of the causal interpretations, and of Reichenbach’s claim 
that the space-time theory presupposes the absence of superluminal 
causation. 


7 CRITICISM OF THE CAUSAL INTERPRETATIONS 


I will now use the present axiomatic formulation of the space-time theory as 
a basis for criticism of the various interpretations which have been proposed 
taking causal notions as primitive. These interpretations are sometimes 
given in an explicit axiomatic form, as e.g. in the works of Robb [1914-1936], 
Reichenbach [1924], and Mehlberg [1935]. In other cases, they take the 
form of informal characterisations of the physical content of the space-time 
theory in terms of causation, as in the discussions of Reichenbach [1928], 
and in the discussions of Grünbaum [1973], van Fraassen [1970], and 
Salmon [1980], all of which follow Reichenbach on a crucial point to be 
questioned here. 

There are three separate points of disagreement. These are: (a) What is the 
space-time theory a theory of, and in what if any sense is it a theory of 
causation? (b) What is the basis and meaning of the relativity of simultaneity 
in the space-time theory? (c) Does the space-time theory allow causation 
faster than light? 

Point (a) has already been addressed. An explicit axiomatic reconstruction 
of a physical theory offers a definite answer to the question what the theory is 
about, in its choice of primitives and its explanation of their physical 
interpretation. The causal formulations present the space-time theory as 
being about causation, in the sense that they include among their primitives 
a binary relation K(a, b) among points (events), interpreted as the (possible) 
existence of a causal chain from a to b. The most complete and rigorous 
causal axiomatisation, that of Robb, uses this as the sole primitive. Thus, the 
space-time theory is explicitly construed as a theory of possible causal 
processes. The content of the theory consists in a set of axioms restricting 
the structure of the total system of causal relations.¹ 

The present formulation denies this, since the theory is axiomatised 
without any such relation being taken as primitive. My view is that the 
space-time theory of special relativity is a theory of physical facts of the five 
types discussed in section 2. Of course, these include certain causal 
processes, such as the propagation of light, but no claims are made about all 
causal processes. The explicit formal reconstruction of the process of 
constructing a standard coordinate system given above is intended to show 
that no appeal to a general notion of causation is required or suggested by the 
standard formulation. I remarked earlier that the word ‘cause’ does not even 
occur in Einstein’s [1905] paper. Clearly the burden of proof is upon the 
proponents of a causal interpretation to explain at what point they believe 
the general concept of causation to enter into the standard interpretation of 
the theory. (Reichenbach’s argument will be considered below.) 

1 Reichenbach [1924] (1969 ed., p. 15) says, ‘The physical world consists of causal chains, 
which show certain structural relations that can be formulated as axioms.’ 

Of course one may propose a theory of ‘universal causation’ using the 
formalism of Minkowski geometry, taking as primitive the relation K(a, b) 
definable therein and interpreting it as referring to any physically possible 
causal process. My claim is simply that this is a different theory from the 
space-time theory of special relativity, since it deals with a different subject- 
matter. The observations which define the subject-matter of the space-time 
theory on the standard interpretation are observations of instances of the five 
relations taken as primitive in the present formulation, and not at all 
observations (if such a thing even be possible) of general binary causal links. 

What led the causal theorists to the view that the special relativistic space- 
time theory involves a theory of causation? Robb’s reasons are somewhat 
murky, and in any case have not been influential in the philosophical 
literature on relativity. Reichenbach’s reasons are quite clear, and have been 
widely accepted by philosophers writing on relativity. They are connected 
with the notion of the relativity of simultaneity, and lead us into topics (b) 
and (c). In brief, Reichenbach argues that Einstein’s definition of distant 
simultaneity, and the resultant relativity of simultaneity, depend upon the 
absence from nature of arbitrarily rapid causal chains.¹ Therefore, it is an 
assumption of the theory that such causal chains do not exist. This 
assumption must be included explicitly among the axioms, in order to 
present the full physical content of the theory.² Reichenbach has been 
followed on this point by many subsequent philosophical commentators.³ 

I believe that this argument of Reichenbach’s is completely mistaken, and 
rests upon a simple equivocation upon the notion of simultaneity. The 
concept of simultaneity which figures in the space-time theory of special 
relativity is that which Einstein defined, and which I formalised above. That 
concept specifies a certain physical relation which may obtain between an 
inertial line, a point on it, and a point not on it. The relation is defined in 
terms of the affine structure, light rays, and clocks. I will refer to this 
simultaneity relation as optical simultaneity. The relativity of optical 
simultaneity is a fact of Minkowski geometry and is a theorem of J; it 
consists in the fact that the hyperplanes of optical simultaneity defined by 
two different T lines at the same point o are not the same. This is the 
relativity of simultaneity which is asserted by the space-time theory of 
special relativity. 

1 E.g., ‘the relativity of simultaneity has nothing to do with the relativity of motion. It rests 
solely on the existence of a finite limiting velocity for causal propagation’ (Reichenbach 
[1928], section 22; 1957 edition, p. 146). Friedman [1983], pp. 167-8 offers a more charitable 
reading of this passage which takes it to express a different proposition from what it seems to. 
But all of Reichenbach’s followers seem to have read him the way I do here, and to accept his 
statement at face value. See n. 3. 

2 ‘These axioms formulate the limiting character of the velocity of light. This assertion must 
not be regarded as a consequence of the theory of relativity; it is one of its presuppositions 
which can be tested experimentally’ (Reichenbach [1924], section 20; 1969 edition, p. 92). 

3 Grünbaum [1973], pp. 347-57; van Fraassen [1970], pp. 153-5. Salmon [1980], p. 122 says, 
‘the presence of the relativity of simultaneity in special relativity hinges crucially upon the 
existence of a finite upper speed limit on the propagation of causal processes and signals’. 
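
The relativity of optical simultaneity can be checked numerically in a minimal sketch. The assumptions here are illustrative and not taken from the text: one spatial dimension, coordinates (t, x) adapted to one inertial frame, the speed of light equal to 1, and Einstein's light-signal definition rendered as a midpoint rule. An event counts as optically simultaneous with the origin o, relative to an inertial line through o with velocity v, when a light signal sent from the line and reflected at the event leaves and returns at times symmetric about o. The function names are hypothetical.

```python
from fractions import Fraction

def radar_midpoint(v, event):
    """Midpoint parameter of the light-signal round trip to `event`.

    The inertial line is the set of points (s, v*s), with |v| < 1.
    A light ray through `event` meets the line at s1 (one null direction)
    and at s2 (the other); their mean locates the point of the line
    that is optically simultaneous with `event`.
    """
    v = Fraction(v)
    t, x = Fraction(event[0]), Fraction(event[1])
    s1 = (t - x) / (1 - v)   # from t - s = x - v*s
    s2 = (t + x) / (1 + v)   # from t - s = -(x - v*s)
    return (s1 + s2) / 2

def simultaneous_with_origin(v, event):
    """True iff `event` is optically simultaneous with o = (0, 0) for velocity v."""
    return radar_midpoint(v, event) == 0

rest, moving = Fraction(0), Fraction(1, 2)
e1 = (0, 3)               # lies on the line t = 0
e2 = (Fraction(3, 2), 3)  # lies on the line t = x/2
print(simultaneous_with_origin(rest, e1), simultaneous_with_origin(moving, e1))
print(simultaneous_with_origin(rest, e2), simultaneous_with_origin(moving, e2))
```

Under these assumptions the two inertial lines through o pick out different sets of simultaneous events (t = 0 for the first, t = x/2 for the second), and nothing in the computation refers to causal signals other than light.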

Reichenbach’s analysis introduces an entirely different relation, which I 
will call causal simultaneity. It consists in the absence of possible causal 
connection between the two events. If causal signals of all finite (but no 
infinite) velocities are possible, then the set of events causally simultaneous 
with a given event o will constitute a hyperplane H (allowing other plausible 
assumptions). Unlike the case of optical simultaneity, this hyperplane H will 
depend only upon the event o, not upon a timelike direction T through the 
point o. This constitutes the absoluteness of causal simultaneity, in such a 
world. In a world where arbitrarily fast causal chains do not exist, such a 
unique hyperplane H of causal simultaneity will not exist. 

Reichenbach’s argument, I believe, turns upon a straightforward 
equivocation between optical and causal simultaneity. Einstein said nothing about 
causation or causal simultaneity, and there was no reason for him to have done 
so. The concept of causal simultaneity is a pure invention of Reichenbach’s, 
introduced by him in the attempt to explicate Einstein’s reasoning, but in 
fact having no bearing at all upon the foundations of the relativistic 
kinematics. I will now try to substantiate this claim. 

In the course of explaining the background to Einstein’s definition of 
simultaneity, Reichenbach imagines an attempt being made to synchronise 
distant clocks by exchange of causal signals, and points out that such an 
attempt would give a definite result if and only if causal simultaneity were 
absolute. No reference whatever to any such procedure of causal syn- 
chronisation of clocks is to be found in Einstein’s paper; this is part of 
Reichenbach’s ‘epistemological analysis’ of Einstein’s definition. 

Reichenbach and the other authors cited attribute to Einstein an 
assumption that Reichenbach’s procedure of causal synchronisation would 
fail, i.e. an assumption of non-absoluteness of causal simultaneity. I see no 
ground whatever for such an attribution, either in Einstein’s text, or in the 
logical structure of the space-time theory itself. 

The main argument for the attribution (which is given by most of the 
cited authors), is that there would be no need for Einstein’s definition of 
simultaneity, if causal simultaneity were absolute. There is also perhaps a 
suggestion by some of these authors that Einstein’s definition would be in 
some way inadmissible or out of place, in a world in which causal 
simultaneity were absolute. 

1 Grünbaum [1973], ch. 12, section C is entitled, ‘History of Einstein’s Enunciation of the 
Limiting Character of the Velocity of Light in vacuo’, pp. 369-86. Grünbaum quotes only one 
passage of Einstein’s in his discussion of the supposed enunciation (pp. 371-2), and no 
reference to causation occurs in the passage. There are references to the principle of relativity 
and to the relativity of simultaneity, but I see nothing in the passage to support a construal of 
these remarks as involving an appeal to Reichenbach’s causal conception of simultaneity. The 
burden of proof seems to be upon those who support this reading of Einstein’s assumptions to 
produce some clear textual evidence for it. This evidence should not depend upon the 
assumption (apparently involved in Grünbaum’s reading of the passage he quotes) that 
remarks concerning the absoluteness or relativity of simultaneity are to be understood with 
reference to Reichenbach’s notion of causal simultaneity. 

These arguments seem to me to be wrong. We saw above that the relation 
of optical simultaneity occurs in the statement of Einstein’s law of the 
constancy of the speed of light for all inertial frames. That law is a statement 
about the relations among rods, inertially moving clocks, and light rays. The 
relation of optical simultaneity defines a particular type of physical 
situation, concerning which the law makes certain assertions. Therefore the 
obvious and sufficient motive for introducing the relation of optical 
simultaneity into the discussion is that this relation occurs in the statement of 
the law which Einstein wished to propound. This law is fully expressible in 
terms of the five primitive physical concepts of the kinematic language. It 
makes no reference to causation. There is thus a perfectly sound logical 
reason for introducing the relation of optical simultaneity into an exposition 
of the space-time theory, in complete disregard of the presence or absence of 
any causal simultaneity relation in the actual world, or of any questions at all 
concerning causation in general.¹ 

The line of thought of Reichenbach and his followers seems to depend 
upon the assumption of a kind of privileged status for causal simultaneity, 
such that if causal simultaneity were absolute then no other sort of 
simultaneity relation could be of any use or relevance for physics. They 
think of Einstein’s optical definition of simultaneity as filling a gap which is 
left by the non-absoluteness of causal simultaneity, in such a way that if that 
gap were not present then there would be no place for Einstein’s definition.² 
This is the only explanation I can see for their belief that Einstein must have 
assumed the non-absoluteness of causal simultaneity when he introduced 
his optical simultaneity relation. The above discussion seems to show that 
this is not so. Einstein’s optical simultaneity relation has nothing to do with 
causation or causal simultaneity, and is perfectly capable of being 
introduced on its own in connection with the law of the constancy of the speed of 
light, in complete disregard of all questions about causation. The earlier 
axiomatisation in the kinematic language brings this out in an explicit way. 

1 Salmon [1980], p. 124 says, ‘If, however, light were not a first-signal, there would hardly seem 
to be any physical justification for according light such a significance in our space-time 
schemes.’ We here offer one such justification: the constancy of the speed of light entirely 
suffices to account for the special role of light in the kinematical theory, without reference to 
its supposed limiting character. 

2 Many of these authors (particularly Reichenbach and Grünbaum) seem to be led in this 
direction by an interest in identifying Einstein’s definition as involving a convention, and a 
belief that the appropriateness of such a convention depends in some way upon some 
objective indeterminacy in the corresponding subject-matter. I think this viewpoint to be 
mistaken in at least two ways. First, ‘simultaneity’ does not constitute a definite physical 
subject-matter until it is specified what physical relations are supposed to determine it. So 
even if it were true that definitions require ‘gaps’ in nature, the existence of a ‘causal 
simultaneity gap’ has no bearing on the appropriateness of an optical definition of 
simultaneity. Second, there seems to be no necessity to think of Einstein’s definition in this 
way, i.e. as conventionally ‘filling the gap’ left by the non-absoluteness of causal simultaneity. 
The text seems to show that a perfectly adequate account of Einstein’s definition of 
simultaneity may be given in essentially syntactical terms, as a definition in the strict formal 
sense: an abbreviation for a complex description of a certain special type of physical situation 
involving clocks and light rays, the abbreviation being introduced in order to simplify the 
statement of a law in which this description occurs. Surely there is no reason to think that the 
appropriateness of an abbreviation should depend upon the presence or absence of absolute 
causal simultaneity in the world. 

One other argument is sometimes suggested, namely that the ‘relativity of 
simultaneity’ breaks down if there are arbitrarily fast causal chains, so that 
the space-time theory of special relativity would be undermined (e.g. 
Salmon [1980], p. 122). This argument makes manifest its dependence upon 
an equivocation between the two physical simultaneity relations. What 
breaks down if there are arbitrarily fast causal chains is the relativity of 
causal simultaneity, whereas what is essential to the space-time theory is the 
relativity of optical simultaneity. The latter does not necessarily break down 
under those conditions, since it does not depend upon any facts about 
causation in general at all, but only upon assumptions like those expressed in 
the axioms given earlier. 

A graphic illustration of the points at issue here is provided by one of the 
non-standard kinematic models discussed in the last section. This model is 
constructed by superimposing the congruence structure of the standard 
model of Minkowski geometry over the optical-inertial structure of 
Maxwell geometry. If we add the further assumption that the processes of 
inertial propagation faster than light allowed by the Maxwellian inertial 
structure are causal in character, the resulting kinematic model possesses 
absolute causal simultaneity while at the same time satisfying the theory J, 
so that the inertial frames of reference are connected by Lorentz 
transformations, and the whole of the standard space-time theory remains in effect. 
This shows graphically that the presence or absence of absolute causal 
simultaneity is irrelevant for the space-time theory J, and for the 
kinematics of special relativity. (Of course this model violates the special 
principle of relativity because of the distinguished hyperplanes of causal 
simultaneity, but that principle is not part of the theory J.) 


APPENDIX: Axioms for Affine and Euclidean Geometry 


Affine geometry is essentially the geometry of lines and parallelism, with no 
reference to distance or angle.¹ I will begin by stating a complete set of 
synthetic axioms for four-dimensional affine geometry. It is significant that 
many axioms are required. This illustrates how many distinct physical 
assumptions are hidden in the usual coordinate representation of the 
inertial-optical affine structure. 

1 The term ‘affine geometry’ is used in a number of different, though related, senses. In the 
mathematical literature it is often used for a much wider class of geometries satisfying most of 
the present axioms, but not the strong axioms implying the existence of many points, such as 
the Axiom of Continuity, and Axiom O-6. These weaker axiom systems are not categorical, 
and have many different types of models, the investigation and classification of which is a 
significant branch of geometry. (An excellent discussion of such affine geometries in the plane 
case may be found in Blumenthal [1961].) Secondly, in the literature relating to differential 
geometry and general relativity one sometimes finds the term ‘affine geometry’ used for a 
manifold equipped with an affine connection. The usage here corresponds to the special case 
where the connection is flat. 

The axioms are stated in terms of a single primitive term, a three-place 
relation B(x, y, z) over the basic elements x, y, z. The elements are to be 
interpreted as points, and the relation B(x, y, z) as signifying that y lies 
between x and z on the straight line determined by x and z. The advantage of 
this primitive term is that it conveys information about which points lie on 
which lines, and also about the order of the points along a given line. Careful 
attention to questions involving order is one of the respects in which modern 
investigations of the foundations of geometry have gone far beyond classical 
geometry. (Cf. Hilbert [1899], Forder [1927], Borsuk and Szmielew [1960].) 
However, while of great technical significance, these questions are not of 
much significance from the viewpoint of intuitive geometry, since the 
properties of the line and plane which they express are so basic as to seem 
trivial and devoid of geometrical or physical interest. These constitute the 
second group of axioms below. 

The axioms which I shall state here are based, with some modifications, 
on those given by Hilbert, Forder, and Borsuk and Szmielew. The latter two 
works follow Hilbert in their main lines, with modifications in the direction 
of a simpler set of basic concepts. (For example, Hilbert assumes lines and 
planes as separate types of object, whereas the latter two works construe 
them as sets of points.) I have here mixed elements from all three treat- 
ments. The only novelty is that the axioms here are for four dimensions 
rather than three. 

In this appendix, I will for convenience write ‘x|y|z’ in place of 
‘B(x, y, z)’. 

First some definitions. Let a and b be two distinct points. The line through 
points a and b is the set of all points x such that x|a|b or a|x|b or a|b|x, 
together with a and b. This set will be referred to as the line ab, and points in 
it will be said to lie on that line. Let a, b, and c be three distinct points which 
do not all belong to one line. The plane through a, b and c is the set of all 
points x which lie on some line yz, for points y and z on lines ab, bc, or ac. 
This set will be called the plane abc. Finally, let a, b, c, and d be distinct 
points which do not all lie in one plane. The hyperplane abcd is the set of all 
points x which lie on some line yz, for points y and z in planes abc, abd, acd, 
or bcd. (These are the planes of the faces of the tetrahedral solid whose 
vertices are a, b, c, and d.) Two lines in the same plane which do not 
intersect, or two planes in the same hyperplane which do not inter- 
sect, or two hyperplanes which do not intersect, are called parallel. 


Axiom Group 1: Axioms of Incidence 


Axiom I-1: Any two distinct points a and b lie on a unique line. 

Axiom I-2: Any three distinct points a, b, and c which do not all lie in a 
line, lie in a unique plane. 

Axiom I-3: Any four distinct points a, b, c, and d which do not all lie in a 
plane, lie in a unique hyperplane. 

Axiom I-4: There exist five points a, b, c, d, and e which do not all lie in one 
hyperplane. 

Axiom I-5: Any two planes which are not parallel intersect in some point. 


Note that the existence of lines, planes and hyperplanes through the 
appropriate sets of points is already a consequence of our definitions. The 
real content of the first three incidence axioms lies in the uniqueness of these 
sets, e.g. that there is no other line through two points a, b than the line ab. 
This ensures that the line ab is identical with the line cd for any two points 
c,d on the line ab. 

The last two axioms determine the dimensionality of the space. Axiom I-4 
determines that there are at least four dimensions. Axiom I-5 determines 
that there are at most four. This is because in a five-dimensional affine space 
there can exist ‘skew planes’, i.e. planes which are not parallel and do not 
intersect at all. (In a four-dimensional affine space two non-parallel planes 
may intersect in only a single point, while in three dimensions they must 
intersect in a line.) 
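
The dimension claims can be illustrated in coordinates. The sketch below assumes the standard numerical models R⁴ and R⁵ and considers only planes whose four direction vectors are linearly independent, i.e. planes with no common direction; `rank` and `planes_meet` are hypothetical helper names. Two planes p + span(u1, u2) and q + span(w1, w2) share a point exactly when q - p lies in the span of the four direction vectors, which is tested by comparing ranks.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Look for a pivot in this column at or below row r.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def planes_meet(p, u1, u2, q, w1, w2):
    """True iff the planes p + span(u1, u2) and q + span(w1, w2) share a point."""
    directions = [u1, u2, w1, w2]
    offset = [b - a for a, b in zip(p, q)]
    return rank(directions) == rank(directions + [offset])

# Four dimensions: the directions already span the space, so the planes meet.
meet4 = planes_meet((0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0),
                    (1, 2, 3, 4), (0, 0, 1, 0), (0, 0, 0, 1))

# Five dimensions: an offset along the fifth axis keeps the planes apart.
meet5 = planes_meet((0, 0, 0, 0, 0), (1, 0, 0, 0, 0), (0, 1, 0, 0, 0),
                    (0, 0, 0, 0, 1), (0, 0, 1, 0, 0), (0, 0, 0, 1, 0))
print(meet4, meet5)
```

In the four-dimensional example the common point is (1, 2, 0, 0); in the five-dimensional example the two planes are skew in the sense described above.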

The set of points between two points a and b is called the segment ab. 
Affine geometry possesses a weak notion of affine congruence between 
segments, defined by the stipulation that opposite sides of a parallelogram 
are affine congruent to one another, and that affine congruence is to be 
transitive. This relation is defined only for segments lying on parallel lines, 
unlike the full congruence relations of Euclidean geometry, or the relations 
Cs, Cr of the text. 


Axiom Group 2: Axioms of Order 


Axiom O-1: If a|b|c then the points a, b, and c are distinct. 

Axiom O-2: If a|b|c then c|b|a. 

Axiom O-3: If a|b|c then not b|a|c. 

Axiom O-4: If a, b, and c are distinct and lie on a line, then a|b|c or b|c|a 
or a|c|b. 

Axiom O-5: If a and b are distinct, then for some point c, a|b|c. 

Axiom O-6: If a and b are distinct, then for some point c, a|c|b. 

Axiom O-7: If a|b|c and b|c|d, then a|b|d. 

Axiom O-8: If a|b|d and b|c|d, then a|b|c. 

Axiom O-9: Let a, b, and c be three points not on a line, and let L be a line 
contained in the plane abc, and let L contain a point x such that a|x|b. Then 
L contains a point y such that b|y|c or a|y|c. 


The first eight of these axioms refer to order of points on a line, and are 
geometrically somewhat trivial. The ninth is known as Pasch’s axiom, or the 
plane axiom of order. It says that if a line cuts one side of a triangle then it cuts 
one of the other sides. It enables one to define the order of lines in a plane, by 
reference to the order of the points in which they cut some fixed line. 
There are two remaining axioms for affine geometry. 


Axiom of Parallels: For any line ab and any point c not on the line ab, there 
exists a unique line L through c parallel to ab, lying in the plane abc. 


Axiom of Continuity: Let P and Q be two arbitrary sets of points. If there 
exists a point u such that p in P and q in Q implies u|p|q, then there exists a 
point v such that p in P and q in Q implies p|v|q or p = v or q = v. 


The Parallel Axiom is familiar from Euclidean geometry. The Axiom of 
Continuity ensures that the points on a line have the order-type of the real 
line. 


These axioms constitute a categorical set, i.e. they have only one model,
up to isomorphism. The standard numerical model A^4 is R^4, with the
relation B(a, b, c) defined to hold iff the vector c-b is a positive scalar
multiple of the vector b-a. This defines the standard affine structure on R^4.
An affine coordinate system for a model M of these axioms is a 1-1 map from
M to A^4 which preserves the relation B(a, b, c) in both directions. The
standard representation theorem for four-dimensional affine geometry
asserts the existence of such a coordinate system for any model M of the
axioms, and the uniqueness of such coordinate systems up to automorphisms
of the structure A^4. These automorphisms are just the non-singular
linear transformations of R^4. Such a coordinate system is defined by five
points o, a, b, c, d of M not lying in a common hyperplane, where o is
specified as the origin, and the ordered set of segments oa, ob, oc, od is
mapped to the unit basis vectors of A^4.
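
The betweenness relation of the standard numerical model can be sketched in a few lines of Python. This is only an illustration: the function name `between` and the sample points are hypothetical, and the requirement that a and b differ is an added assumption, made so that the degenerate case a = b = c is excluded in agreement with Axiom O-1.

```python
from fractions import Fraction


def between(a, b, c):
    """B(a, b, c): c - b is a positive scalar multiple of b - a.

    Assumption (not in the text): a != b is required, so that the
    degenerate case a = b = c does not count as betweenness.
    """
    u = [Fraction(y) - Fraction(x) for x, y in zip(a, b)]  # b - a
    v = [Fraction(z) - Fraction(y) for y, z in zip(b, c)]  # c - b
    # First coordinate in which b - a is non-zero, if there is one.
    pivot = next((i for i, x in enumerate(u) if x != 0), None)
    if pivot is None:
        return False
    t = v[pivot] / u[pivot]  # the only possible scalar
    return t > 0 and all(vi == t * ui for ui, vi in zip(u, v))


# Hypothetical sample points of R^4.
a, b, c = (0, 0, 0, 0), (1, 2, 0, 1), (3, 6, 0, 3)
off_line = (1, 0, 0, 0)

print(between(a, b, c), between(c, b, a), between(b, a, c))
```

On these sample points the relation behaves as Axioms O-2 and O-3 require: it is symmetric under reversal of the triple, and it fails when the middle point is moved to an end.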

The text uses the concept of a subaffine geometry, in which not all pairs 
of points lie on a line. The easiest way to modify the above axioms to 
characterise subaffine geometry is as follows. Define a and b to be colinear if 
there is some c which bears the relation B to a and b. (So if a and b are not 
colinear then the line ab in the earlier sense will consist of just a and b.) Such 
a line ab will be called improper, the other lines being proper. The axioms will 
be weakened so as not to imply the existence of a proper line through each 
pair of points. Add to Axiom I-2 the hypothesis that the lines ab and ac 
are proper. Add to Axiom I-3 the hypothesis that the lines ab, ac, and ad are 
proper. Add to Axiom I-4 the hypothesis that the lines ab, ac, ad, and ae are 
proper. Add to Axioms O-5 and O-6 the hypothesis that the line ab is proper. 
Add to Axiom O-9 the hypothesis that the lines ab, bc, ac and L are all
proper. Add to the Parallel Axiom the hypothesis that the line ab is proper, 
and let the conclusion state there is a unique proper line L having the given 
properties. It now follows from the definitions of plane and hyperplane and 
the modified Parallel Axiom that a plane contains at least two parallel 
families of proper lines, and a hyperplane contains at least three, which are 
not coplanar. (I have formulated the subaffine Parallel Axiom so as to imply 
that proper lines form parallel families. All physically interesting subaffine
geometries have this property.) 

For Axiom 1 of the text we need the notion of a convex subaffine geometry. 
For this it suffices to add the following: 


Convexity Axiom: There is a point p and a set S of points such that (a) a 
line L through p is proper iff it includes an element s of S distinct from p, and 
(b) if psqs' is a parallelogram with s and s’ in S then the line pr is proper for 
any point r on the segments sq and s’q. 


It is easy to see by using property (b) that if this axiom is satisfied by any 
set S it is satisfied by some convex S’, so that this axiom agrees with our 
informal definition of a convex subaffine geometry. 

Let M be any model of this modified axiom system. Axiom I-4 will ensure 
the existence of four linearly-independent parallel families of proper lines, 
so that affine coordinates may be constructed in the same manner as for full 
affine geometry, with coordinate lines parallel to these proper lines. This 
coordinate system will determine an embedding of the model into the full 
affine geometry A^4, and the embedding will be unique up to affine
transformations because the coordinates assigned to the five points 
a,b,c,d,e fix the whole embedding. (Of course these remarks do not 
constitute a proof. A detailed proof of the affine embedding theorem for an 
axiomatic subaffine geometry may be found in Mundy [a].) 

Axiom 4 of the text asserts that the relation Cs(A, B, C, D) defines a
three-dimensional Euclidean geometry over the three-dimensional affine space
G(K). In the context of our other axioms, Axiom 4 is equivalent to the 
following two assumptions (taken from Borsuk and Szmielew, pp. 81-82). 


Axiom 4 (a): Let there be given two planes P and P’ containing lines
parallel to K (i.e. lines of G(K)), and lines A, B, C, D, A’, B’, C’, D’ parallel to
K (i.e. points of G(K)), with A, B, C on P, D not on P, and points A’, B’, C’
on P’, and D’ not on P’. If Bs(A, B, C), Bs(A’, B’, C’), Cs(A, B, A’, B’),
Cs(B, C, B’, C’), Cs(D, A, D’, A’), and Cs(D, B, D’, B’), then
Cs(D, C, D’, C’).


Axiom 4 (b): Given a half-plane W in G(K) with edge P, a segment AB on
P, and a triangle DEF in G(K): if Cs(A, B, D, E), then there exists a unique
point C in W with Cs(A, C, D, F) and Cs(B, C, E, F).


University of Oklahoma 
INTRODUCTION


Genetics is the first branch of biology to be cast into axiomatic form and has 
become a favourite application in the philosophy of biology. After the 
publication of Mendel’s famous laws (Mendel [1901]) there was a steady 
development of genetic theories from what might be called character-factor 
genetics via linkage and fine-structure genetics to molecular genetics.1 At first 
glance, nobody would deny that this development was progressive. Closer 
inspection, however, reveals some problems. In order to say that a shift from 
a theory T to a theory T’ is progressive, it is necessary to compare T and T’ in
some sense. One way of comparison is to say that T can be reduced to T’.
However, the question of reduction or reducibility in genetics has not been
resolved, in spite of continued attempts.2

The main source of failure in our view is the inadequate description or 
presentation of the theories involved in the comparison; ‘inadequate’ with 
respect to just this aim of comparison. By this claim we do not want to offend 
the working biologist. His primary aim is to introduce improved theories 
and apply them to concrete phenomena. As history proves, presentations of 
genetic theories have been quite adequate to this aim. Modern textbooks 
provide elaborate bodies of theory together with rich fields of application.3
However, all this taken for granted, up to now, questions of comparison of 
those theories have not been discussed among geneticists. 


Received 18 June 1984


1 See, for instance, (Carlsson [1966]) for an historical survey. 

2 Compare, e.g. the controversy between Hull and Schaffner as resulting in a series of papers. 
See, for example, Hull [1976] and Schaffner [1976]. 

3 Standard expositions are e.g. Goodenough and Levine [1974] and Strickberger [1968]. 
With respect to similar situations in other fields, such questions may well
be said to be of a foundational nature, and, as is usual in such cases, the
best way of attacking such questions is to lay bare the basic concepts and
assumptions of the theories involved with a painful degree of precision and
pedantry.

An axiomatisation in first-order logic of Mendel’s first law has been given 
in Woodger [1959]. Kyburg [1968] provides a modified version. No similar 
attempts have been made for the other genetic theories, however. We believe 
that this is due to the first-order formalism used by Woodger and Kyburg,
which, by making simple statements look extremely complicated, had a
repellent rather than an attractive effect.

Since Suppes first advocated the use of informal set-theoretic predicates 
for reconstruction of empirical theories, a number of such theories have 
been described and reconstructed in this way.2 This method yields precise
formulations without spending much effort on mere formal syntax.3 Sneed
[1971] and Stegmueller [1976] have elaborated on Suppes’ proposal and
introduced a complete ‘structuralist’ meta-theory.4 Sneed also proposed a
precise account of reduction based on preceding ideas of Adams [1959]. 
Dawe [1982] was the first to apply the method of set-theoretic predicates to 
genetic theories. 

In this and the following two papers, we will make a further attempt to 
reconstruct the genetic theories mentioned above, that is, to present their 
basic concepts and assumptions in a clear and precise way. By using the 
method of informal set-theoretic predicates as a means of reconstruction, we 
will arrive at relatively simple presentations (in comparison to e.g., 
Woodger’s). Our aim is to extend formal reconstruction beyond Mendelian 
genetics. Once molecular genetics, and linkage and fine structure genetics
are reconstructed, the question of comparison and reducibility can be
attacked at a qualitatively new level: by means of formal logical investigations.
In part two we will introduce a reduction relation and show that the
core of character-factor genetics reduces to the core of molecular genetics.
In part three, Mendelian, linkage and fine structure genetics are reconstructed
as specialisations of the core of character-factor genetics, and
the question of their reducing to specialisations of molecular genetics will be 
investigated. By this, at the level of genetics, we hope to contribute to the 
foundational questions indicated above, and at the level of the philosophy of 


1 For instance in Suppes [1967]. 

2 A few examples from different areas are Balzer and Goettner [1983], Balzer and Moulines 
[1981], Balzer [1982] and de la Sienra [1982]. See Stegmueller et al. [1982], pp. 39-40 for 
further references. 

3 The relatively complicated looking expressions in D1 below do not prove the contrary. They 
mirror formal expressions of just this complexity which are used in genetic practice, and 
which appear more lucid only if restricted to special cases (like the diploid case with two 
factors). 

4 See especially Stegmueller [1976], p. 34 for a description of the ‘method of
informal set-theoretic predicates’.
science, we hope to contribute to further clarification of the concepts of
reduction and progress. 

It is to be stressed that in parts 1 and 2 only the ‘basic cores’ of the 
respective theories are treated, which will appear rather empty. ‘Interesting’ 
special laws which are empirically non-trivial are introduced only in part 3. 
The distinction between ‘basic cores’ and ‘special laws’ has proven fruitful 
for the description of the synchronic as well as of the diachronic structures of 
empirical theories, and will not be defended here.1

Of course, as with all reconstructions of theories, it can (and will) be said 
that they do not adequately represent the working scientist’s conception of 
those theories. Such an objection is easy to advance, in particular because we 
do not simply reproduce standard textbook accounts. 

To this objection, there are four replies. First, any reconstruction of any 
theory will cut off part of the informal theory from which it starts. This 
cannot be avoided because reconstruction involves a certain amount of 
idealisation and abstraction, just as ordinary science does. Second, the 
language of set-theoretic predicates is not the language of the working 
scientist. In a set-theoretic setting, he may not find some items which he 
once thought important. Often, those items are present, but only implicitly, 
and are therefore not recognised. Conversely, ‘the’ biologist may find 
additional items and concepts which he did not think of but which are 
introduced for the sake of logical completeness. Such deviations from the 
usual mode of presentation may (though need not) contribute to the 
foundational questions centering around the comparison of genetic theories. 
Third, we have tried to grasp the ‘spirit’ of the respective theories: we did 
not stick closely to the textbook presentations, rather we sometimes felt 
driven to include features corresponding to what geneticists do rather than 
what they say. For instance, the predicate ‘HERUNIT’ (D5 below) is of that
nature. Fourth, if there are strong arguments pointing out inadequacies of
reconstruction, we do not object to removing them by amending the
reconstruction. Even then, the reconstruction was valuable because it 
allowed for precise arguments pointing out its inadequacy. 


I AN ABSTRACT MATING CALCULUS 


Most readers will recall from school that Mendel performed systematic 
experiments with peas (Pisum sativum) (Mendel [1901]). He studied the
transmission of characters (appearance, shape), which may differ for the 
parental plants and their progeny from one generation to the next. One of 
the character differences studied was seed shape, whether ‘wrinkled’ or 
‘round’, others were seed colour or stem shape. Mendel hypothesised that 
there were some factors responsible for the presence of these characters. 


1 See Balzer and Sneed [1977/78]. 
2 An example to this point is found in Balzer [1984a]. 
However, the presence of a factor for wrinkled seeds, as evidenced by the
appearance of wrinkled seeds in a parent, did not necessitate the appearance 
of wrinkled seeds in progeny. Instead, in the situation where one or other of 
the factors contributed by the parents was that for round, the round seed 
shape would appear. Otherwise, the seeds were wrinkled. In order to 
account for the outcomes actually observed, he offered certain hypotheses 
about how different factors (‘responsible’ for different characters, and in 
different parents) could combine. By continued crossing experiments, 
Mendel came to the conclusion that factors relating to a single character 
difference, say seed colour or stem shape, segregated independently. 

In order to deal with such combinations, a formal calculus was used, 
regulating the transmission of factor combinations of two parents (parental 
populations) to factor combinations as revealed in their progeny. In a 
combination of factors giving rise to one individual (or population) several 
different factors may occur which only together yield one particular 
character of the individual (or population). This number of different factors 
is assumed to be the same for all characters of a given species and is called the 
ploidy. Factor combinations combining in the causation of one character are
called allelic. 

Consider a simple example of individuals being described by k = 2 types 
of characters, and each character being ‘caused’ by p = 2 factors (diploid 
case). Any complete description of the two characters present in some 
individual will then involve four factors. If, for instance, the two factors
$\alpha_1$, $\beta_1$ give rise to character $b_1$ (of type 1), and the factors
$\alpha_2$, $\beta_2$ give rise to character $b_2$ (of type 2), then an
individual being ‘characterised’ by $\langle b_1, b_2 \rangle$ can be
‘described’ in terms of factors by
$\langle \alpha_1, \beta_1; \alpha_2, \beta_2 \rangle$, and the latter
expression is called the factor content of that individual. If a second
individual is characterised by $\langle c_1, c_2 \rangle$ with factor content
$\langle \gamma_1, \delta_1; \gamma_2, \delta_2 \rangle$
then the factor content of those two individuals’ progeny is given by some 
formal combination of the parental factor contents. Geneticists here use a 
kind of formal multiplication in the following way. The parental factor 
contents are re-written as: 


$(\alpha_1 + \beta_1)(\alpha_2 + \beta_2)$ and $(\gamma_1 + \delta_1)(\gamma_2 + \delta_2)$


and then the well known algebraic rule is applied to the two first and second
‘factors’, respectively, and afterwards to the two resulting expressions, i.e.


$(\alpha_1 + \beta_1)(\gamma_1 + \delta_1) = (\alpha_1\gamma_1 + \alpha_1\delta_1 + \beta_1\gamma_1 + \beta_1\delta_1)$
$(\alpha_2 + \beta_2)(\gamma_2 + \delta_2) = (\alpha_2\gamma_2 + \alpha_2\delta_2 + \beta_2\gamma_2 + \beta_2\delta_2)$
and


$(\alpha_1\gamma_1 + \cdots + \beta_1\delta_1)(\alpha_2\gamma_2 + \cdots + \beta_2\delta_2) = (\alpha_1\gamma_1\alpha_2\gamma_2 + \cdots + \beta_1\delta_1\beta_2\delta_2)$.


Any term ‘$\varepsilon_1\varepsilon_2\varepsilon_3\varepsilon_4$’ in the final ‘sum’ represents some possible factor content
of some individual occurring in the progeny. For instance, $\alpha_1\gamma_1\alpha_2\gamma_2$ would
represent some individual the character of type i of which is determined by
the factors $\alpha_i$ and $\gamma_i$ (i = 1, 2). Typically, only the first and the last expression
of this ‘calculation’ will be made explicit, e.g.


$(\alpha_1 + \beta_1)(\alpha_2 + \beta_2)(\gamma_1 + \delta_1)(\gamma_2 + \delta_2) = (\alpha_1\gamma_1\alpha_2\gamma_2 + \cdots + \beta_1\delta_1\beta_2\delta_2)$.


In our reconstruction we will use a function m in order to deal with such 
calculations. m will take two ‘factor contents’ (of the above form) as 
arguments and assign to them a formal sum of weighted ‘possible’ factor
contents. In the special case of Mendel’s law our m in the present example 
will take the form 


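$m(\langle \alpha_1, \beta_1; \alpha_2, \beta_2 \rangle, \langle \gamma_1, \delta_1; \gamma_2, \delta_2 \rangle)$
$= \frac{1}{16}\langle \alpha_1, \gamma_1; \alpha_2, \gamma_2 \rangle + \cdots + \frac{1}{16}\langle \beta_1, \delta_1; \beta_2, \delta_2 \rangle$.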
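
The ‘multiplying out’ of two parental factor contents can be sketched in Python for the diploid case of the example. The sketch rests on assumptions that are not part of the text: the function name `mate_diploid` is hypothetical, factors are represented by plain strings, and every combination is given the same weight, as in the special Mendelian case just mentioned.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def mate_diploid(parent1, parent2):
    """Multiply out two diploid factor contents.

    Each parent is a tuple with one pair of allelic factors per type of
    character. The result maps every possible progeny factor content to
    its weight; all combinations are weighted equally (an assumption).
    """
    per_type = [
        [(x, y) for x in alleles1 for y in alleles2]  # one factor from each parent
        for alleles1, alleles2 in zip(parent1, parent2)
    ]
    combos = list(product(*per_type))
    weights = Counter()
    for combo in combos:
        weights[combo] += Fraction(1, len(combos))
    return dict(weights)


# Hypothetical factor names standing in for the Greek letters of the example.
p1 = (("a1", "b1"), ("a2", "b2"))
p2 = (("g1", "d1"), ("g2", "d2"))
progeny = mate_diploid(p1, p2)
print(len(progeny), progeny[(("a1", "g1"), ("a2", "g2"))])
```

With four distinct factors per parent there are sixteen terms, each of weight 1/16. When factors repeat, identical terms are merged and their weights added.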


Let us begin by introducing a purely formal calculus which is to represent 
the ‘calculation’ of the different possible combinations of factors which may 
arise from the mating of two ‘parental’ combinations of factors. ‘Factors’ 
here are mere symbols; biological interpretation is not foremost. The 
calculus turns out to be relatively complicated. Its full complexity, however, 
is only used in the full generality of classical genetics. Usually, 
applications of this calculus are rather simple due to the small numbers 
p (ploidy) and k (types of characters) involved as in the above example. 
Although there are alternative possibilities for formalising such calculations,1
we have chosen the present one because it seems to mirror very
closely how geneticists actually perform their calculations. 

Some preliminary notational conventions may be helpful. 


D0 (1) $\mathbb{N}$ and $\mathbb{R}$ denote the set of natural (i.e. {1, 2, 3, ...}) and real numbers,
respectively. $\mathbb{R}_0^+$ denotes the non-negative reals.
(2) ‘$x \in \mathcal{Y}$’ is an abbreviation for ‘x is a non-empty, finite set’.
(3) if $x \in \mathcal{Y}$ and $n \in \mathbb{N}$ then $x^n$ denotes the n-fold Cartesian product of x with
itself, i.e. $x^n = x \times x \times \cdots \times x$.
(4) if $x \in \mathcal{Y}$ then the union of x, $\bigcup x$, is defined as usual by
$\bigcup x = \{y / \exists z (z \in x \wedge y \in z)\}$.
(5) if $x \in \mathcal{Y}$ then $\|x\|$ denotes the cardinality of x.
(6) if $x, y \in \mathcal{Y}$, $c \subseteq x \times y$, and $a \in x$ then c(a) denotes the set
$\{b \in y / \langle a, b \rangle \in c\}$.

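1 For instance the ‘genetic algebras’ of Etherington [1939] and Woertz-Busekros [1980].

D1 (Mating Calculus)
Let $F = \langle F_1, \ldots, F_k \rangle$ be such that all $F_i \in \mathcal{Y}$, let $p, k \in \mathbb{N}$, and let $W \in \mathcal{Y}$.
(a) $V(p, F) = \prod_{i \le k} (F_i)^p$; elements of V(p, F) are denoted by $\Theta$, $\Theta'$, $\Theta_i$.
(b) if $\Theta = \langle \gamma_1, \ldots, \gamma_k \rangle \in V(p, F)$ we write $[\Theta] = \{\gamma_1, \ldots, \gamma_k\}$.
(c) S(W) is the set of all formal expressions of the form
$\lambda_1\Theta_1 + \cdots + \lambda_n\Theta_n$ where $n \in \mathbb{N}$, $\lambda_1, \ldots, \lambda_n \in \mathbb{R}_0^+$,
$\sum_{i \le n} \lambda_i = 1$ and for all $i \le n$: $\Theta_i \in W$.
(d) we say that $m: S(W) \times S(W) \to S(W)$ is bilinear iff for all
$\sum_i \lambda_i\Theta_i$ and $\sum_j \lambda'_j\Theta'_j \in S(W)$:
$m(\sum_i \lambda_i\Theta_i, \sum_j \lambda'_j\Theta'_j) = \sum_{i,j} \lambda_i\lambda'_j\, m(\Theta_i, \Theta'_j)$.

We will abbreviate ‘$\lambda_1\Theta_1 + \cdots + \lambda_n\Theta_n$’ by ‘$\sum_{i \le n} \lambda_i\Theta_i$’, and we write ‘S(p, F)’
instead of ‘S(V(p, F))’.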

Elements of $F_1, \ldots, F_k$ will be factors and elements of V(p, F) sequences
of sequences of allelic factors. More precisely, any element of V(p, F) will
be a k-tuple of p-tuples of factors, such that the factors in each p-tuple are
allelic. Each $\Theta \in V(p, F)$ represents a complete description of the factors
giving rise to some distinctive appearance of some individual or of all
individuals of a given population. If some individual b has this appearance
we say that $\Theta$ is the factor content of b. Members of S(W) arise from those of
W by formal ‘multiplication’ with real numbers $\lambda$ such that $0 \le \lambda \le 1$, and
by formal ‘summation’ of the former elements. All stipulations of these
definitions are purely formal in the sense of being rules for the manipulation
of certain sequences of symbols. The symbols $\lambda$ and $\Theta$ have a clear meaning
in genetic theory: $\lambda$ denotes the probability of the factor content $\Theta$
(following $\lambda$ in the expression $\lambda\Theta$) of some population. The symbol ‘+’, on
the other hand, has no direct interpretation. It simply is an auxiliary device
used in order to describe ‘possible’ combinations of factor content during
the process of mating. Reference to this symbol could be completely
omitted. We use it because geneticists do so, and because this yields a rather
elegant method of ‘×-ing out’ the different possible combinations. A formal
sum $\sum_{i \le n} \lambda_i\Theta_i$ with $\Theta_i \in V(p, F)$ for $i \le n$ represents just n different,
‘weighted’ factor contents. Intuitively, if in a progeny population B there are
n different ‘types’ $\tau_1, \ldots, \tau_n$ of individuals, each type being represented by
the corresponding factor content $\Theta_i$, then $\sum_{i \le n} \lambda_i\Theta_i$ will carry the information
that the expected probability of an individual in B being of type $\tau_i$ is $\lambda_i$.
V(p, F) can be embedded in S(p, F) in a natural way by means of $\Theta \mapsto 1\Theta$.
We will identify V(p, F) with its image under this embedding. By means of
D1-d, m-values can be ‘reduced’ to values of m for elements of W, so that m is
determined by means of the values on the ‘base’ W for S(W). The effect of
this will be that the m- and y-functions can be treated homogeneously in the
sense of having the same ‘space’ (S(W)) forming their domain and range (see
D3-6 below and D10-5 in part 2).
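
The way bilinearity reduces m to its values on the ‘base’ W can be sketched in Python. The representation is an assumption made for illustration only: a formal sum is a dictionary from base elements to weights, identical terms are merged, and `bilinear_extend` and `toy_base` are hypothetical names. The toy base function is not a genetic law.

```python
from fractions import Fraction


def bilinear_extend(m_base):
    """Extend a function on pairs of base elements to formal sums.

    m_base(theta1, theta2) returns a formal sum (dict: element -> weight).
    The extension follows the bilinearity condition: the weight of each
    resulting term is multiplied by the weights of both arguments.
    """
    def m(sum1, sum2):
        out = {}
        for theta1, lam1 in sum1.items():
            for theta2, lam2 in sum2.items():
                for theta, lam in m_base(theta1, theta2).items():
                    out[theta] = out.get(theta, Fraction(0)) + lam1 * lam2 * lam
        return out
    return m


def toy_base(x, y):
    # Hypothetical base values: equal elements reproduce themselves,
    # different elements yield each of the two with weight 1/2.
    if x == y:
        return {x: Fraction(1)}
    return {x: Fraction(1, 2), y: Fraction(1, 2)}


m = bilinear_extend(toy_base)
half = Fraction(1, 2)
result = m({"A": half, "B": half}, {"A": Fraction(1)})
print(result)
```

Since the weights of both arguments sum to 1 and every base value is itself a formal sum with total weight 1, the weights of the result again sum to 1, so the extended function stays inside the same ‘space’.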

We note that it is necessary to work with sequences of factors rather than 
with sets of factors because in some cases—e.g. in Mendelian genetics—it 
will be necessary to distinguish ‘expressions’ in which some factor occurs 
twice (or more often) from ‘expressions’ in which each factor occurs only 
once. ‘Expressions’ of the first type cannot be dealt with by simply using sets 
$\{\alpha_1, \ldots, \alpha_p\}$ of factors.
2 THE CORE OF CLASSICAL CHARACTER-FACTOR
GENETICS (CFG) 


We can now introduce the basic axioms of what we want to call character-factor
genetics. Here, studies are made of populations mating. In particular,
the numbers of individuals with characters of different types are noted. 
Regularities are found in the proportions with which different characters 
appear in progeny, given the appearance or non-appearance of those 
characters in the parent populations. Roughly, the theory explains relative 
frequencies of characters in progeny by associating factors to each character 
and by postulating certain laws governing the combination of those factors 
during the process of mating.1

We use the following basic notions. There is a set $\mathfrak{B}$ of populations, each
$B \in \mathfrak{B}$ being a finite set of individuals of a certain kind. Typically,
populations are identified by means of certain characters being present in
the population. For instance, two populations may be distinguished by
different eye-colour. Characters, however, do not provide a definition of
populations, for during some mating experiment we may want to differentiate
between two populations with identical characters, one being a
‘parental’ population and the other being (part of) the progeny. In these
cases further criteria like separation in space and (or) time or distinctions on
the individual level (e.g. through the *-function below) are used for the
identification of populations. It should be noted that we allow for empty
populations, and therefore the formalism will not refer to individuals at all.
Individuals are just the members of the sets $B \in \mathfrak{B}$, if $B \neq \emptyset$. There is a set of
possible characters S, giving the appearances of the individuals. That is,
each individual is ‘characterised’ by some subset of S containing precisely
those characters which are realised in (through) the individual. We will
assume different types of characters (like eye-colour, and height), each type
being given by a set $S_i$ of characters (of type i). An individual then is
characterised by one character from each such type (e.g. blue and tall, or
brown and small). Since the set S of all characters can be defined as the union
of the types ($S = \bigcup\{S_i / i \le k\}$) we do not use S as basic. Rather we start with
a family $(S_i)_{i \le k}$ of types of characters as basic and use the above definition
when necessary (e.g. in D3-5 below). In order to state the characters
occurring in a population, a function $\mathfrak{a}$ is required such that $\mathfrak{a}: \mathfrak{B} \to \mathrm{Pot}(S)$.
The power set is required to indicate that a population may have (exemplify)
a subset of the set of possible characters. Furthermore, there is a function
$\mathfrak{m}: \mathfrak{B} \times \mathfrak{B} \to \mathrm{Pot}(\mathfrak{B})$ associating with any two populations $B, B' \in \mathfrak{B}$ their
progeny $\mathfrak{m}(B, B')$, which may be split up into several different populations
according to the different characters present. The progeny, of course, may
be empty.


1 Of course, in the real world, the ‘idealised’ frequencies of the theory are subject to statistical 
fluctuations. 
The concepts introduced so far can be said to have a clear meaning
independent of any genetic theory; they are ‘genetics-independent’ or 
‘genetics-non-theoretical’. This may be expressed by saying that they 
represent the ‘data’ which are given and which are to be explained by a 
genetic theory. 


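D2 x is a data-structure for CFG iff there exist $\mathfrak{B}$, k, $(S_i)_{i \le k}$, $\mathfrak{a}$ and $\mathfrak{m}$ so that
$x = \langle \mathfrak{B}, (S_i)_{i \le k}, \mathfrak{a}, \mathfrak{m} \rangle$ and


(1) $\mathfrak{B} \in \mathcal{Y}$ and $k \in \mathbb{N}$

(2) for all $i, j \le k$: $S_i \in \mathcal{Y}$ and if $i \neq j$ then $S_i \cap S_j = \emptyset$.

(3) $\mathfrak{a}: \mathfrak{B} \to \mathrm{Pot}(\bigcup\{S_i / i \le k\})$.

(4) $\mathfrak{m}: \mathfrak{B} \times \mathfrak{B} \to \mathrm{Pot}(\mathfrak{B})$.

(5) for all $B, B', B_1, B'_1 \in \mathfrak{B}$: (5.1) $B \notin \mathfrak{m}(B, B')$ and $B' \notin \mathfrak{m}(B, B')$.

(5.2) $\mathfrak{m}(B, B') = \mathfrak{m}(B', B)$ (5.3) if $B = \emptyset$ or $B' = \emptyset$ then $\mathfrak{m}(B, B') = \emptyset$.
(6) for all $\emptyset \neq B \in \mathfrak{B}$ and all $i \le k$: $\|\mathfrak{a}(B) \cap S_i\| = 1$.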


D2-5 expresses some obvious requirements satisfied by the mating function
$\mathfrak{m}$: progeny populations are different from the parental populations by
which they are produced (5.1), the mating function is ‘commutative’ (5.2),
and parental populations, if empty, lead to ‘no offspring’ (5.3). Note that the
latter does not imply that each $B_i$ in $\mathfrak{m}(B, B')$ is non-empty. D2-6 says that
$\mathfrak{a}$-images contain precisely one character from each type $S_i$ of characters.
That is, $\mathfrak{a}(B)$ gives a complete description of B by stating which character
from each type is realised in B.
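
The conditions on data-structures can be checked mechanically on small finite examples. The following Python sketch assumes a particular encoding that is not part of the text: populations are frozensets of individuals, the character function and the mating function are dictionaries, and all names, including `is_data_structure`, are hypothetical.

```python
from itertools import product


def is_data_structure(pops, types, a, mating):
    """Check conditions (1) to (6) of a data-structure on a finite example."""
    if not pops or not types:                                # (1)
        return False
    for i, s in enumerate(types):                            # (2)
        if not s or any(s & t for t in types[i + 1:]):
            return False
    all_chars = set().union(*types)
    for B in pops:
        if not a[B] <= all_chars:                            # (3)
            return False
        if B and any(len(a[B] & s) != 1 for s in types):     # (6)
            return False
    for B, B2 in product(pops, repeat=2):
        progeny = mating[(B, B2)]
        if not progeny <= pops:                              # (4)
            return False
        if B in progeny or B2 in progeny:                    # (5.1)
            return False
        if progeny != mating[(B2, B)]:                       # (5.2)
            return False
        if (not B or not B2) and progeny:                    # (5.3)
            return False
    return True


# Hypothetical toy example: two parental populations and one progeny population.
P1 = frozenset({"p1", "p2"})
P2 = frozenset({"p3"})
F1 = frozenset({"f1", "f2", "f3"})
pops = {P1, P2, F1}
types = [{"round", "wrinkled"}]
a = {P1: {"round"}, P2: {"wrinkled"}, F1: {"round"}}
mating = {pair: frozenset() for pair in product(pops, repeat=2)}
mating[(P1, P2)] = mating[(P2, P1)] = frozenset({F1})

print(is_data_structure(pops, types, a, mating))
```

Dropping one of the two symmetric entries of the mating table makes the check fail on the commutativity requirement (5.2).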

On the theoretical level CFG uses the following concepts. There is a set F
of factors such that, intuitively, to each character $s \in S$ there corresponds
precisely one $\alpha \in F$ which ‘causes’ s. And in the same way as with characters
we divide F into a family $(F_i)_{i \le k}$ of ‘types of factors’. The idea is, of course,
that factors of type $F_i$ give rise to the characters from $S_i$. There is a number
p, the ploidy of the population under consideration. The picture of each
factor yielding precisely one character just used turns out to be too simple. In
CFG it is combinations of p factors which give rise to one character.1 In
accordance with our classification of characters and factors into types it is
combinations of factors from one common $F_i$ which are responsible for
characters from $S_i$. Factors occurring in one common $F_i$ are called allelic,
and we will say that a sequence $\langle \alpha_1, \ldots, \alpha_p \rangle$ is allelic iff all $\alpha_1, \ldots, \alpha_p$ are from
one common $F_i$. Note that, a priori, the number of different $F_i$, namely
k, might have been chosen to differ from the number of $S_i$. Our identifying
these two numbers amounts to requiring that for each given type $S_i$ of
characters all characters of that type are generated by one group of allelic
factors. A generalisation of the present account with $(S_i)_{i \le k}$, $(F_j)_{j \le n}$ and
$k < n$ would leave room for different sets of allelic factors such that allelic
combinations of factors from different such sets would create characters of
the same type.


1 Given the nature of the subject material, in which variation is the rule, even this is too simple.
Sometimes more than p factors will affect a given character. However, for what might be
considered ‘paradigm’ applications of CFG the present formulation seems acceptable.

Accordingly, we use a function $d: F^p \to S$, i.e. $d: \bigcup\{(F_i)^p / i \le k\} \to \bigcup\{S_i / i \le k\}$
(D3-5 below) which maps p-tuples of allelic factors into characters.
Finally there is a function m ‘representing’ the mating function
$\mathfrak{m}$ via factor content on a theoretical level. Intuitively, to any two factor
contents $\Theta$, $\Theta'$ which are present in the individuals of two populations B, B’,
there is associated a formal sum $\sum_{i \le n} \lambda_i\Theta_i$ representing the progeny of B and
B’ in the following way. The progeny is split up into several populations
$B_1, \ldots, B_n$ according to the characters present. Each $\Theta_i$ is the factor content
of one such population and each $\lambda_i$ is the relative frequency of individuals of
$B_i$ (relative to the total progeny), i.e. $\|B_i\| / \|\bigcup\{B_j / j \le n\}\|$. Formally, for
reasons of coherence we have blown up the domain of m to include formal
sums of (and not only single) factor contents (D3-6). This is compensated by
requiring m to be bilinear (D3-8).


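D3 x is a potential model of CFG ($x \in M_p(\mathrm{CFG})$) iff there exist $\mathfrak{B}$, k, $(S_i)_{i \le k}$,
$\mathfrak{a}$, $\mathfrak{m}$, p, $(F_i)_{i \le k}$, d and m such that


(1) $x = \langle \mathfrak{B}, (S_i)_{i \le k}, \mathfrak{a}, \mathfrak{m}, p, (F_i)_{i \le k}, d, m \rangle$.

(2) $\langle \mathfrak{B}, (S_i)_{i \le k}, \mathfrak{a}, \mathfrak{m} \rangle$ is a data-structure for CFG.

(3) $p \in \mathbb{N}$.

(4) for all $i, j \le k$: $F_i \in \mathcal{Y}$ and if $i \neq j$ then $F_i \cap F_j = \emptyset$.

(5) $d: \bigcup\{(F_i)^p / i \le k\} \to \bigcup\{S_i / i \le k\}$.

(6) $m: S(p, F) \times S(p, F) \to S(p, F)$ is a partial function (where
$F = \langle F_1, \ldots, F_k \rangle$).

(7) for all $i \le k$ and all $\langle \alpha_1, \ldots, \alpha_p \rangle \in (F_i)^p$: $d(\langle \alpha_1, \ldots, \alpha_p \rangle) \in S_i$.

(8) m is bilinear (compare D1).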


Our treating m as partial represents a slight generalisation which proves
fruitful for the comparison of CFG with molecular theories. D3-7 says that
combinations of factors from one $F_i$ result in characters from the corresponding
type $S_i$. It may be noted that we have not imposed any temporal
structure on the (theoretical and non-theoretical) mating functions $\mathfrak{m}$ and m.

Such a structure, which would enable us to speak about sequences of
crossing experiments (eventually introduced as stochastic or even Markov
processes), can be obtained as a specialisation of our potential models.
We did not want to overload the formalism from the beginning; it is
complicated enough in its present version.

Intuitively, potential models consist of ‘realizations’ of the theory’s basic 
concepts. The conditions in D3 fix the set-theoretic ‘types’ of the functions 
involved. Besides, there are some conditions which further characterise the 
functions and sets but each such condition refers to one single function in 
isolation. Such conditions do not have the properties of ‘cluster laws’ in 
which several (i.e. more than one) of the functions are characterised 
simultaneously. By introducing the models of CFG we just have to add one
real cluster law to the conditions of D3. 


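D4 x is a model of CFG ($x \in M(\mathrm{CFG})$) iff
$x = \langle \mathfrak{B}, (S_i)_{i \le k}, \mathfrak{a}, \mathfrak{m}, p, (F_i)_{i \le k}, d, m \rangle$ and
(1) $x \in M_p(\mathrm{CFG})$
(2) for all $B, B' \in \mathfrak{B}$ and $\Theta, \Theta' \in V(p, F)$: if $m(\Theta, \Theta') = \sum_{i \le n} \lambda_i\Theta_i$,
$\mathfrak{a}(B) = d[\Theta]$ and $\mathfrak{a}(B') = d[\Theta']$ then for all $B^* \in \mathfrak{m}(B, B')$:

(2.1) $\sum_{i \le n,\ d[\Theta_i] = \mathfrak{a}(B^*)} \lambda_i = \dfrac{\|B^*\|}{\sum_{B_1 \in \mathfrak{m}(B, B')} \|B_1\|}$

(2.2) if $B^* \neq \emptyset$ then $B^* \in \bigcup\{\mathfrak{a}^{-1}(d[\Theta_i]) / i \le n\}$.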


The central axiom D4-2 can be depicted as below in Figure 1. At the
‘non-theoretical’ level of populations, we have the experimental mating mapping
$\mathfrak{m}$. At the level of factors is the corresponding theoretical mating mapping
m. These two levels are connected by means of d and $\mathfrak{a}$ which ‘meet’ at the
level of characters. R, R’ and $R_1, \ldots, R_n$ denote ‘complete’ sets of characters,
‘complete’ in the sense of containing precisely one element from each
type $S_i$, i.e. for all $i \le k$: $\|R \cap S_i\| = 1$. The axiom then requires that the sum
of all $\lambda_i$’s for which the corresponding $\Theta_i$ give rise to some population B*
is just the ratio of the number of individuals of B* to the total number of
progeny. More precisely, in the ‘sum’ $\sum \lambda_i\Theta_i$ we pick out all $\lambda_i$ for which
$\Theta_i$ gives rise to a given population B*. The sum of these numbers (left
hand side of the equation in 2.1) has to be identical with the relative frequency
of individuals of ‘type’ B* in the total progeny (right hand side of 2.1).
Requirement 2.2 excludes the case that the factor content of some progeny
population B* does not occur in the ‘image-sum’ $\sum \lambda_i\Theta_i$.


[Figure 1: diagram of axiom D4-2] 
Given further assumptions on the form of $m$, the $\lambda$-values can be 
calculated theoretically, using $m$. Thus, D4-2 is a means for calculating $\lambda$’s 
which will correspond to the observed or expected relative frequencies. 
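
The comparison demanded by D4-2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the representation of the value of $m$ as a list of (coefficient, factor combination) pairs, the function names `predicted_frequencies` and `satisfies_d4_2`, the dominance rule `dominant`, and all counts are hypothetical and are not part of the formalism above. Empty progeny populations are simply skipped, which is a simplification.

```python
from fractions import Fraction


def predicted_frequencies(mating_sum, d):
    """Add up the coefficients of all summands that d maps to the same appearance."""
    totals = {}
    for coefficient, factors in mating_sum:
        appearance = d(factors)
        totals[appearance] = totals.get(appearance, Fraction(0)) + coefficient
    return totals


def satisfies_d4_2(mating_sum, d, progeny_counts):
    """Check 2.1 and 2.2 for the progeny of a single cross.

    progeny_counts maps the appearance of each progeny population to its size.
    """
    predicted = predicted_frequencies(mating_sum, d)
    total = sum(progeny_counts.values())
    for appearance, size in progeny_counts.items():
        if size == 0:
            continue  # simplification: only non-empty progeny populations are checked
        if appearance not in predicted:
            return False  # 2.2 fails: appearance does not occur in the image-sum
        if predicted[appearance] != Fraction(size, total):
            return False  # 2.1 fails: coefficient sum differs from relative frequency
    return True


# Hypothetical cross of two hybrids, with factor A dominant over a.
cross = [
    (Fraction(1, 4), ("A", "A")),
    (Fraction(1, 2), ("A", "a")),
    (Fraction(1, 4), ("a", "a")),
]


def dominant(factors):
    """Assumed determiner: the dominant factor fixes the character."""
    return "round" if "A" in factors else "wrinkled"


ideal = satisfies_d4_2(cross, dominant, {"round": 3, "wrinkled": 1})
finite_sample = satisfies_d4_2(cross, dominant, {"round": 5474, "wrinkled": 1850})
```

With the idealised counts the check succeeds, while a large finite sample close to 3:1 fails under exact equality. This is one way of seeing why the text speaks of observed or expected relative frequencies rather than of exact counts.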


3 THE INDIVIDUAL BASIS 


The predicates given so far refer to populations and not to proper 
individuals. In this respect the picture they present is incomplete. The 
behaviour and characteristics of the individuals which take part in a mating 
must also be included. Indeed, this is required not only for classical genetics, 
but for molecular genetics, which will be treated in part 2. Accordingly we 
now introduce ‘underlying’ structures dealing with the level of individuals. 
It turns out (and was intended) that functions on the level of populations can 
be defined in terms of the corresponding functions on the individual level 
(D7 below). Also, an explication of the individual basis is needed for a 
comparison with molecular genetics which works entirely at the individual 
level (see D7 below and D14-1 in part 2). 


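D5 $x$ is a hereditary unit ($x \in \mathrm{HERUNIT}$) iff there exist $B$, $k$, $(S_i)_{i \le k}$, $a$, $*$, ♂ 
and ♀ so that $x = \langle B, (S_i)_{i \le k}, a, * \rangle$ and 
(1) $B \neq \emptyset$ and $k \in \mathbb{N}$ 
(2) for all $i, j \le k$: $S_i \neq \emptyset$ and if $i \neq j$ then $S_i \cap S_j = \emptyset$ 
(3) $a: B \to \mathrm{Pot}(\bigcup \{S_i / i \le k\})$ 
(4) $*: B \times B \to \mathrm{Pot}(B)$ 
(5) ♂ $\subseteq B$, ♀ $\subseteq B$, ♂ $\cap$ ♀ $= \emptyset$ and ♂ $\cup$ ♀ $= B$ 
(6) for all $b, b', b_1, b_2 \in B$: 
(6.1) if $b*b' \neq \emptyset$ then ($b \in$ ♂ and $b' \in$ ♀) or ($b \in$ ♀ and $b' \in$ ♂) 
(6.2) $b*b' = b'*b$ 
(6.3) if $b*b' \cap b_1*b_2 \neq \emptyset$ then $\{b, b'\} = \{b_1, b_2\}$ 
(7) for all $i \le k$ and all $b \in B$: $\|a(b) \cap S_i\| = 1$. 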


Here, $B$ is a set of individuals, and $(S_i)_{i \le k}$ a collection of types of characters 
as in D2. The function $a$ corresponds to æ on the individual level. To each 
individual $b$ there is assigned a set of characters $a(b)$ so that $a(b)$ contains 
precisely one element from each type $S_i$ (D5-7). The function $*$ corresponds 
to ws. To any pair of individuals $*$ assigns a set of individuals, namely the 
progeny of that pair. We write $b*b'$ instead of $*(b, b')$. $b*b'$ may be empty, and 
by convention we set $b*b' = \emptyset$ for pairs $\langle b, b' \rangle$ for which mating does not 
make sense, e.g. when $b = b'$. ♂ and ♀ yield a partition of $B$ into male (♂) and 
female (♀) individuals. D5-6.1 says that progeny can be produced only from a 
pair of different sex, and 6.3 says that if we know the progeny we can know 
the parents. 
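
For a finite set of individuals, conditions (2) to (7) of D5 can be checked mechanically. The following Python sketch is an illustration only: the function name `is_hereditary_unit`, the encoding of $a$ as a dictionary and of $*$ as a function `mate`, and the example individuals are all assumptions introduced here. Condition (1) is not checked.

```python
from itertools import product


def is_hereditary_unit(B, S, a, mate, males, females):
    """Check conditions (2)-(7) of D5 for finite sets.

    B: set of individuals; S: list of character types (sets);
    a: dict from individuals to sets of characters;
    mate: function taking two individuals to their set of progeny.
    """
    all_characters = set().union(*S)
    # (2) distinct character types are disjoint
    for i, j in product(range(len(S)), repeat=2):
        if i != j and S[i] & S[j]:
            return False
    # (3) a(b) is a set of characters; (7) exactly one character of each type
    for b in B:
        if not a[b] <= all_characters:
            return False
        if any(len(a[b] & S_i) != 1 for S_i in S):
            return False
    # (5) males and females partition B
    if not (males <= B and females <= B):
        return False
    if males & females or (males | females) != B:
        return False
    pairs = list(product(B, repeat=2))
    for b, c in pairs:
        offspring = mate(b, c)
        if not offspring <= B:        # (4) progeny are individuals of B
            return False
        if offspring != mate(c, b):   # (6.2) mating is symmetric
            return False
        mixed = (b in males and c in females) or (b in females and c in males)
        if offspring and not mixed:   # (6.1) parents are of different sex
            return False
    for (b, c), (b1, b2) in product(pairs, repeat=2):
        # (6.3) shared progeny implies the same pair of parents
        if mate(b, c) & mate(b1, b2) and {b, c} != {b1, b2}:
            return False
    return True


# Hypothetical example: one pair of parents with two offspring.
B = {"m1", "f1", "c1", "c2"}
S = [{"round", "wrinkled"}, {"yellow", "green"}]
a = {
    "m1": {"round", "yellow"},
    "f1": {"wrinkled", "green"},
    "c1": {"round", "yellow"},
    "c2": {"round", "green"},
}
males, females = {"m1", "c1"}, {"f1", "c2"}
litters = {frozenset({"m1", "f1"}): {"c1", "c2"}}


def mate(b, c):
    return litters.get(frozenset({b, c}), set())


same_sex_litters = {frozenset({"m1", "c1"}): {"c2"}}


def bad_mate(b, c):
    return same_sex_litters.get(frozenset({b, c}), set())


valid = is_hereditary_unit(B, S, a, mate, males, females)
invalid = is_hereditary_unit(B, S, a, bad_mate, males, females)
```

Keying the litters by unordered pairs makes the symmetry required by 6.2 hold by construction, and a pair consisting of one individual taken twice receives the empty set, in line with the convention just stated.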

The individual mating function * can be used in order to define progeny 
and parental population in a given HERUNIT y. This is why we did not 
explicitly refer to these two categories in D5. 



66 W. Balzer and C. M. Dawe 


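D6 If $y = \langle B, (S_i)_{i \le k}, a, * \rangle \in \mathrm{HERUNIT}$ then 
(a) $\mathrm{PROG}(y) = \{b \in B / \exists b_1, b_2 \in B\,(b \in b_1*b_2)\}$ and $\mathrm{PAR}(y) = B \setminus \mathrm{PROG}(y)$ 
(b) If $B_1, B_2 \subseteq B$ then $\mathrm{PROG}(B_1, B_2) = \{x / \exists b_1 \in B_1 \exists b_2 \in B_2\,(x \in b_1*b_2)\}$ 
(c) $R$ is an admissible combination of characters in $y$ ($R \in \mathrm{AD}(S_i, i \le k)$) iff 
(1) $R \subseteq \bigcup \{S_i / i \le k\}$ and (2) for all $i \le k$: $\|R \cap S_i\| = 1$. 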


PROG($y$) is called the progeny of $y$, PAR($y$) the parental population of $y$. 
Note that PAR($y$) includes individuals which may have no offspring. This is 
convenient and can always be eliminated by narrowing down $B$ to the set of 
‘real’ parental populations plus their offspring. PROG($B_1$, $B_2$) is the 
offspring of populations $B_1$ and $B_2$, not yet classified further according 
to differences in character. D6-c just introduces a notation for ‘bundles’ of 
characters which form complete descriptions. This notion was already used 
earlier. 
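
The three notions of D6 translate directly into set comprehensions. The Python sketch below is a minimal illustration; the function names `prog`, `par`, `prog_of` and `is_admissible`, the encoding of $*$ as a function `mate`, and the example individuals are assumptions made for the example.

```python
def prog(B, mate):
    """PROG(y): all individuals that are progeny of some pair in B (D6-a)."""
    return {b for b1 in B for b2 in B for b in mate(b1, b2)}


def par(B, mate):
    """PAR(y): everything in B that is not progeny (D6-a)."""
    return B - prog(B, mate)


def prog_of(B1, B2, mate):
    """PROG(B1, B2): offspring of one parent from B1 and one from B2 (D6-b)."""
    return {x for b1 in B1 for b2 in B2 for x in mate(b1, b2)}


def is_admissible(R, S):
    """D6-c: R contains characters only, exactly one from each type in S."""
    all_characters = set().union(*S)
    return R <= all_characters and all(len(R & S_i) == 1 for S_i in S)


# Hypothetical example; "u1" has no offspring and is not anyone's offspring.
B = {"m1", "f1", "c1", "c2", "u1"}
S = [{"round", "wrinkled"}, {"yellow", "green"}]
litters = {frozenset({"m1", "f1"}): {"c1", "c2"}}


def mate(b, c):
    return litters.get(frozenset({b, c}), set())


progeny = prog(B, mate)
parents = par(B, mate)
```

In this example `parents` contains the childless individual `"u1"`, which illustrates the remark that PAR($y$) includes individuals which may have no offspring.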


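D7 If $x = \langle \mathfrak{B}, (S_i)_{i \le k}, \text{æ}, \mathit{ws} \rangle$ is a data-structure for CFG and $y = 
\langle B, (S'_i)_{i \le n}, a, * \rangle \in \mathrm{HERUNIT}$ then $x$ is based on $y$ iff 
(1) $k = n$ and for all $i \le k$: $S_i = S'_i$ 
(2) for all $X$: $X \in \mathfrak{B}$ iff either 
(2.1) there is $R \in \mathrm{AD}(S_i, i \le k)$ such that $X = a^{-1}(R) \cap \mathrm{PAR}(y)$ or 
(2.2) there are $R_1, R_2, R_3 \in \mathrm{AD}(S_i, i \le k)$ and $Y_1, Y_2$ such that $Y_1 = 
a^{-1}(R_1) \cap \mathrm{PAR}(y)$ and $Y_2 = a^{-1}(R_2) \cap \mathrm{PAR}(y)$ and $X = a^{-1}(R_3) \cap 
\mathrm{PROG}(Y_1, Y_2)$ 
(3) for all $B \in \mathfrak{B}$ and $b \in B$: $\text{æ}(B) = a(b)$ 
(4) for all $B, B' \in \mathfrak{B}$: 
(4.1) $\mathit{ws}(B, B') \neq \emptyset$ iff ($B \neq \emptyset$, $B' \neq \emptyset$ and $B \cup B' \subseteq \mathrm{PAR}(y)$) 
(4.2) if $B \neq \emptyset \neq B'$ and $B \cup B' \subseteq \mathrm{PAR}(y)$ then $\mathit{ws}(B, B') = \{X / X \neq \emptyset 
\wedge \exists R \in \mathrm{AD}(S_i, i \le k)\,(X = \mathrm{PROG}(B, B') \cap a^{-1}(R))\}$. 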


Essentially, D7 ‘reduces’ data-structures to the level of individuals. D7-2 is 
an explicit definition of $\mathfrak{B}$ in terms of $y$. $B \in \mathfrak{B}$ is a population iff it can be 
characterised in terms of the a-function. That is, each population consists of 
a set of individuals which all have the same characters. A further 
differentiation is made according to whether the population consists of 
‘parents’ only or of offspring. ‘Mixed’ populations are not admitted. D7-3 
and 4 contain explicit definitions of æ and ws. æ($B$) is given by the characters 
of any of B’s elements. The progeny populations—if there are any—of two 
parental populations B, B’ are determined as non-empty subsets of the 
progeny of B and B’ which are characterised by some admissible combi- 
nation of characters (D7-4.2). If we knew that B and B’ were populations in 
the sense of D7-2.2 then 4.2 would yield the set of all non-empty 
populations ‘in’ the progeny of B and B’. D7-4.1 is of technical nature. It 
excludes the counterintuitive situation of empty ‘parental populations’ 
producing offspring as well as of offspring producing further offspring. The 
latter requirement is technically convenient and could be avoided. Also, 4.1 
excludes cases where non-empty populations create no offspring at all. 
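
The way D7 builds populations out of individuals can be illustrated for a small finite case. The sketch below covers only D7-2.1 and D7-4.2; the function names, the dictionary encoding of $a$, and the example data are hypothetical, and bundles of characters are represented as frozensets.

```python
from itertools import product


def admissible_combinations(S):
    """All bundles containing exactly one character from each type in S."""
    return [frozenset(choice) for choice in product(*S)]


def fiber(B, a, R):
    """The individuals of B whose characters are exactly the bundle R."""
    return frozenset(b for b in B if frozenset(a[b]) == R)


def progeny(B1, B2, mate):
    """Offspring of one parent from B1 and one parent from B2."""
    return frozenset(x for b1 in B1 for b2 in B2 for x in mate(b1, b2))


def parental_population(B, a, parents, R):
    """D7-2.1: the parents that all show the bundle R."""
    return fiber(B, a, R) & parents


def progeny_populations(B, S, a, mate, P1, P2):
    """D7-4.2: non-empty classes of the progeny of P1 and P2, one per bundle."""
    offspring = progeny(P1, P2, mate)
    candidates = (offspring & fiber(B, a, R) for R in admissible_combinations(S))
    return {X for X in candidates if X}


# Hypothetical example with a single type of character.
B = {"m1", "f1", "c1", "c2", "c3", "c4"}
S = [{"round", "wrinkled"}]
a = {b: {"round"} for b in ("m1", "f1", "c1", "c2", "c3")}
a["c4"] = {"wrinkled"}
litters = {frozenset({"m1", "f1"}): {"c1", "c2", "c3", "c4"}}


def mate(b, c):
    return litters.get(frozenset({b, c}), set())


parents = frozenset({"m1", "f1"})
round_parents = parental_population(B, a, parents, frozenset({"round"}))
populations = progeny_populations(B, S, a, mate, round_parents, round_parents)
```

Here the progeny of the round parental population falls into two populations, of three and one individuals respectively. The appearance of each population is then read off from any of its elements, as in D7-3.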


4 SOME METATHEORETICAL CONSIDERATIONS 


We now want to reflect briefly on some metatheoretic features as exemplified 
by CFG which play an important role in the structuralist meta-theory. On 
the one hand this will yield further ‘evidence’ for this meta-theory, on the 
other hand it will bring to attention certain problems which in scientific 
practice often are put aside too easily. 

First, the question may be asked what are the intended applications of 
CFG. What are the real systems to which CFG applies and which thereby 
are ‘explained’? We think the answer has already been given by our 
definition of data-structures. All the concepts used in this definition have a 
clear meaning independently of CFG, and therefore may be considered as 
constituting an ‘observational language’ (relative to CFG). Real systems 
which can be described in this language will therefore count as candidates 
for intended applications. But this requirement is not sufficient. There will 
be many real systems having all properties of data-structures for CFG by 
accident. 

It is in principle impossible to give precise necessary and sufficient 
conditions characterising the intended applications of an empirical theory. 
Instead of arguing for this statement we want to contrast it with what is now 
considered a more adequate picture about the determination of the intended 
applications. The determination proceeds in two or three steps. In the first 
step, the founder of the theory points to some (very few) real systems 
(paradigms) which are successfully dealt with by his new theory. In the case 
of CFG these will be constituted by Mendel’s experiments with peas. In a 
second step it is admitted that all systems sufficiently similar to the 
paradigms also are intended applications. In the case of CFG, experiments 
crossing populations of plants (or animals) will yield systems which are 
sufficiently similar to Mendel’s paradigms. There will, however, always be 
borderline cases for which it is difficult to decide whether they ‘fall under’ 
the theory or not. Think of long term observations where mutations occur, 
or of the pattern of propagation of fern. In such cases it is the theory which 
decides. If the system under consideration on closer investigation turns out 
to be a model of the theory it will be counted as an intended application, 
otherwise not. 

Second, one may ask which of the terms of CFG are theoretical in the 
sense of being at least partially determined by CFG. Conversely, one may 
ask which of CFG’s terms are independent of CFG in the sense that CFG 
does not provide any means for their determination. If there is a distinction 
of that sort among the terms of CFG this gives some information about 
CFG’s position in the hierarchy or web of genetic or biological theories. 
CFG-non-theoretical terms (i.e. those which are not CFG-theoretical) have 
to be presupposed as being given by some theory ‘underlying’ CFG or by 
‘direct observation’ in any experimental situation. For if CFG does not and 
cannot contribute to their determination, these must be determined before 
CFG is applied. Recently, a purely formal definition of ‘term t of theory T 
being theoretical in T’ has been proposed along the lines of intuition used 
above. A formal application of this definition is beyond the aim of this paper. 
We conjecture that of the four functions æ, ws, d and m the first three are 
CFG-non-theoretical while m is CFG-theoretical. A more delicate question 
concerns the status of ploidy and factors. Intuitively, these seem to be CFG- 
theoretical, too, because they get their meaning in and through CFG. But 
the definition of theoreticity applies to functions or relations only and does 
not work for ‘naive sets’ (like $F_i$) or ‘objects’ (like p)—as opposed to ‘proper’ 
relations or functions (like m and d). We think that CFG provides a nice 
example for further reflections on that problem. 

Third, there is the question of what it is precisely that is claimed by CFG 
about the world. What is CFG’s empirical claim? According to Sneed the 
empirical claim of an empirical theory is that all intended applications ‘are’ 
models of the theory. In case of CFG: all intended data-structures ‘are’ 
models of CFG. Of course, ‘are’ here cannot mean identity. The obvious 
meaning is that all intended data-structures can be extended to proper 
models of CFG by adding suitable d- and m-functions. The empirical claim 
of CFG thus would be: for all intended data-structures x there exists an m- 
function which, added to x, yields a proper model of CFG. The point of 
formulating empirical claims in this form is to avoid an explicit ‘empirical’ 
determination of the theoretical function m which often presupposes that 
CFG is already valid, and therefore is in danger of circularities with respect 
to confirmation. We refer to (Stegmueller [1976], pp. 51–94) for an extensive 
discussion. 
Relational Holism 
and Quantum Mechanics¹ 


by PAUL TELLER 


One can give a strong sense to the idea that a relation does not ‘reduce’ to non- 
relational properties by saying that a relation does not supervene upon the non- 
relational properties of its relata. That there are such inherent relations I call the 
doctrine of relational holism, a doctrine which seems to conflict with traditional ideas 
about physicalism. At least parts of classical physics seem to be free of relational 
holism, but quantum mechanics, on at least some interpretations, incorporates the 
doctrine in an all pervasive way. 


1 Local and Global Physicalism 
2 Relational Holism 
3 Relations in Classical Physics 
4 Relational Properties in Quantum Mechanics 
5 Conclusion 


1 LOCAL AND GLOBAL PHYSICALISM 


A physicalist takes the facts about a thing or things to be exhausted by the 
physical facts. Even granting a distinction between physical and other facts, 
one may ask what this claim means. Supervenience provides an attractive 
answer to this question, attractive because the answer is consistent with the 
absence of explicit reductions or definitions of the non-physical in terms of 
the physical. For example, a physicalist might claim that mental states 
supervene on brain or other bodily states, in the sense that two physically 
identical bodily states would exhibit the same mental states, even though 
these mental states might well not be definable in terms of the bodily states. 
More generally, the physicalist can say that all the facts supervene on the 
physical facts, meaning by this that whenever two real or counterfactual 
things or cases agree in all their physical characteristics, then they agree in all 
their other characteristics as well. (I want to take the sense of possibility 
implicit here in the strongest possible way, that is as logical possibility.) 
1 This research was supported by a National Endowment for the Humanities Fellowship for 
Independent Study and Research. The work greatly benefited from the opportunity to 
present it at the 1984 meetings of the British Society for the Philosophy of Science. So many 
people have given me very substantial aid with this work that it would be impossible for me to 
mention them all. 

2 In this compressed presentation I pass over a huge number of details concerning the 
multifaceted idea of supervenience and its application in explicating physicalism. For details 
and references to the literature, see Teller [1984]. 
So far I have spoken vaguely of ‘things’ or ‘cases’. One may opt here for 
either of two more clearly specified alternatives. On the first, one may 
explain physicalism as the thesis that whenever two reidentifiable, narrowly 
localisable individuals—things in the narrow sense—have the same non- 
relational physical properties, then they have the same non-relational, 
non-physical properties as well. I will call this local supervenience of the non- 
physical on the physical, or local physicalism for short. (In a while I will refine 
this characterisation.) Thus describing local physicalism depends on a 
distinction between relational and non-relational properties. Having no 
generally accepted analysis of the distinction, we will here have to make do 
with our rough and ready preanalytic understanding: A non-relational 
property is a property ‘internal’ to a thing, a property which a thing has 
independently of the existence or state of other objects. I suspect that local 
physicalism is what expresses our untutored physicalist intuitions. I intend 
local physicalism to suggest a contrast (a contrast which is not sharp) with 
the case in which supervenience is applied not to narrowly conceived 
individuals but to broader contexts, situations, or settings. 

Local physicalism faces an immediate problem, for it is not clear how it 
can deal with relational properties. Clearly, two individuals can be exact 
physical duplicates or replicas and yet can differ as to properties such as 
being a Cadillac owner or being the largest planet in a planetary system. One 
can try to get supervenience of general relational properties on the physical 
by applying the idea of supervenience to all the relational and non-relational 
physical properties holding of or among all the things entering into the 
relevant supervening relations. But it would be messy to carry out this idea 
systematically because often one must deal with a nonspecific number of 
relevant individuals entering into the relations, for example when the 
individuals are covered by an existential quantifier. So some authors appeal 
to the simple expedient of covering everything at once. They say that given 
two possible ways the whole world might be, if those two ways agree in all 
physical respects (including physical relations) then the two ways agree in all 
other respects as well. I will call this global supervenience of the non-physical 
on the physical, or global physicalism for short. Global physicalism automati- 
cally takes care of the problem of relational properties. 


2 RELATIONAL HOLISM 


I suspect that global physicalism does not express what most of us 
preanalytically understand by physicalism. Our preanalytic idea is that the 
world is fixed by the physical facts pertaining, one by one, to individual 
objects. We imagine that relational properties then arise from these 
individual physical facts. But fix the non-relational physical properties of all 
individuals and everything has been fixed. By way of contrast, if only global 
physicalism is true we do not really have the mechanism pictured by the 
eighteenth-century metaphor of the world as a great clock. 
Given the ubiquity of relational properties, we can have local physicalism 
only if relational properties supervene on the non-relational ones. 
Presumably the important key is to show that physical relations and 
relational properties supervene on physical non-relational properties. 
Indeed, examples encourage optimism for this method of getting global 
physicalism to collapse back down to local physicalism. The relation x is 
longer than y supervenes on the non-relational properties of x and y, 
specifically on their length. The same kind of thing goes for is heavier than, 
has the same charge as, and other physical relations. The hope is that once all 
the non-relational physical properties of objects are set, then so are all the 
relational physical properties and so then all the non-physical properties and 
relations as well. This approach to relations really involves a strengthening 
of the notion of local physicalism: Henceforth, by local physicalism I will 
mean the view that all the non-relational properties of an individual a 
supervene on a’s physical non-relational properties; and any relations 
holding among individual relata a, b, c, . . . supervene on the non-relational 
properties of the relata a, b, c, . . . . 
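
The supervenience of a relation on the non-relational properties of its relata can be made concrete for a finite collection of cases. The Python sketch below is only an illustration: the function name `supervenes`, the encoding of objects as (name, length) pairs, and the contrasting relation `next_to` are assumptions introduced here. It tests a finite list of cases, whereas the supervenience at issue in the text ranges over all possible cases.

```python
def supervenes(cases, relation, intrinsic):
    """True if no two cases agree in the intrinsic properties of their relata
    while differing as to whether the relation holds."""
    verdicts = {}
    for x, y in cases:
        profile = (intrinsic(x), intrinsic(y))
        holds = relation(x, y)
        # setdefault records the first verdict for a profile and returns it later
        if verdicts.setdefault(profile, holds) != holds:
            return False
    return True


# Hypothetical rods given as (name, length); r1 and r3 are duplicates, as are r2 and r4.
rods = [("r1", 2.0), ("r2", 3.0), ("r3", 2.0), ("r4", 3.0)]
cases = [(x, y) for x in rods for y in rods if x != y]


def length(rod):
    return rod[1]


def longer_than(x, y):
    return x[1] > y[1]


adjacent = {("r1", "r2")}


def next_to(x, y):
    """A relation stipulated pair by pair, not fixed by the lengths."""
    return (x[0], y[0]) in adjacent


length_fixes_longer_than = supervenes(cases, longer_than, length)
length_fixes_next_to = supervenes(cases, next_to, length)
```

The first check succeeds because any two pairs of rods with matching lengths agree on which rod is longer. The second fails because the pairs (r1, r2) and (r3, r4) match in length yet differ as to adjacency.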

The present paper suggests that the hope of the local physicalist will be 
frustrated. I will appeal to a widespread class of plausible cases (well known 
in the context of other problems) in which collections of objects have 
physical relations which do not supervene on the non-relational physical 
properties of the parts. If we are disappointed as local physicalists the 
problem offers us a kind of consolation prize, for the very statement of the 
problem provides help with a previously intractable issue for analytic 
philosophers. Holism has always seemed incoherent, for it seems to say that 
two distinct things can somehow be entangled or intermeshed so that they 
are not two distinct things after all. Yet apparent unintelligibility does not 
prevent holism from recurring, not only in the work of philosophers of East 
and West, but also in what quantum mechanics seems to many of us to be 
saying about the world. The statement of local physicalism’s failure suggests 
a reading which we can give to holism which analytic philosophers ought to 
find relatively clear. By relational holism I will mean the claim that objects 
which in at least some circumstances we can identify as separate individuals 
have inherent relations, that is, relations which do not supervene on the non- 
relational properties of the distinct individuals. Relational holism is free of 
the incoherence which threatens less clearly stated forms of holism. It is 
sufficient for an object to be a distinct individual that it have a non-relational 
property. And it is quite consistent to suppose that two such distinct 
individuals, each having a non-relational property, should also stand in 
some inherent relation to each other. The failure of local physicalism leaves 
us with a kind of holism, but at least it is a holism we can understand. 


1 I learned this idea from an unpublished manuscript by Jaegwon Kim. A similar idea may be 
found in Leibniz’s view that there are no ‘purely extrinsic denominations’. Parkinson [1965], 
pp. 42-5 provides a secondary source. 
3 RELATIONS IN CLASSICAL PHYSICS 


Does classical physics provide any really clean examples of inherent 
relations? I can only partially examine this question here. But I want to 
touch on two examples from classical physics because I feel they are needed 
for a systematic view of the topic, because they provide a valuable contrast 
with the case of quantum mechanics, and because I hope others will think 
about some of the questions which come up along the way. 

Consider the case of classical mechanics’ point particles under the 
influence of gravitational forces. In some sense, what is true about one 
particle depends on all the rest since the acceleration, or force, experienced 
by one particle depends on where the others are. We customarily gloss this 
situation without appearing to appeal to inherent relations. We say that each 
particle has its full complement of properties which we unreflectively take to 
be non-relational properties such as position, mass, velocity, and accelera- 
tion; and we say that changes in these properties are caused by the 
circumstance of other particles at other places having similar properties. 
This gloss has a price, of course. It involves action at a distance, which 
philosophers and physicists since Newton have found mysterious or even 
nonsensical. (I want to put aside for the moment the other obvious problem 
with this gloss, the relationality of position and velocity.) 

Classical mechanics gives us an alternative to action at a distance. Objects 
have potential energy, and one can say that it is local changes in potential 
energy, not distant causes, which produce change in velocity. It will help 
comparisons which follow if we give more detail on how appeal to potential 
energy works. Technically, a differential equation relates rate of change of 
potential energy in space to rate of change of velocity in time. By recasting 
this statement informally, we can see how a differential equation can give an 
account of change without appealing to action at a distance. Think of time 
and space broken up into very small cells. It is the fact that the potential 
energy of a particle in a spatial cell differs from the potential energy it would 
have in an immediately neighbouring cell that is responsible for a difference 
in velocity between the present and the next temporal cell. Or, we can take 
the potential to be properties of the cells (the potential energy which a 
particle of unit mass would have were it in the cell). Then change in velocity 
of a particle rests on the way the potential energy changes between one cell 
and its neighbours. We can take these cells to be arbitrarily small, which 
gives us a sense in which no action at a distance has been involved. 
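
The informal picture of small cells can be turned into a short numerical sketch. It assumes the standard one-dimensional relation $dv/dt = -dU/dx$ for a particle of unit mass, where $U$ is the potential energy per unit mass; the function name `simulate`, the cell and time-step sizes, and the linear example potential are choices made here for illustration only.

```python
def simulate(potential, x, v, cell, tick, steps):
    """Advance a unit-mass particle using only local differences in potential.

    potential(x) gives the potential energy per unit mass at position x.
    """
    for _ in range(steps):
        # difference between this spatial cell and its immediate neighbour
        slope = (potential(x + cell) - potential(x)) / cell
        # that local difference fixes the change in velocity over one temporal cell
        v -= slope * tick
        x += v * tick
    return x, v


g = 9.8  # assumed uniform field strength, so that U(h) = g * h


def uniform_potential(h):
    return g * h


x_end, v_end = simulate(uniform_potential, x=100.0, v=0.0, cell=1e-3, tick=1e-3, steps=1000)
```

At each step the update consults only the potential in the current cell and in its neighbour. Nothing about the location of distant bodies enters the loop explicitly, although the potential function itself encodes it.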

However, phrasing physics in terms of potential energy does not 
automatically avoid relationality. Potential energy, properly understood, is 
a relational property. Instead of saying that each of two particles causes the 
other to have a potential energy, mysteriously, instantaneously, and at a 
distance, we should say that the potential energy is a relation that holds 
between the two particles. We know that in general properties which we 
appear to attribute non-relationally to individuals often really are relational 
properties holding between several individuals, as when we speak of my 
being married as a property I have instead of as a relation which holds 
between me and my wife. Similarly, talk in classical mechanics about the 
potential energy of a particle is really implicit talk about the mutual potential 
energy which holds between particles. So when we customarily phrase an 
explanation of change of motion in terms of the potential energy which a 
particle does or would have in a cell, we must keep in mind that these 
potential energies are really relational. We are concerned here with potential 
energies relative to distant bodies, that is with the potential energy relation 
which holds between the particle or cell in question and the other particles in 
the world. 

In sum, classical mechanics can represent the springs of change in two 
prima facie very different ways: in terms of action at a distance or in terms of 
the relational property of potential energy. While the two accounts yield the 
same phenomena, one wonders whether, or in what way, the two accounts 
are equivalent at some deeper level. 

But is the relation of potential energy inherently relational? That is, is it 
the case that potential energy does not supervene on the non-relational 
properties of its relata? Classical physics calculates potential energy as a 
function of the masses of and the spatial separation between particles, and 
then describes an association between potential energy and the way particles 
change their state of motion. To determine the kind of relationality involved 
we need to pin down the nexus of contingency in this overall description. As 
a first alternative one could locate the contingency in the connection 
between masses, separation, and potential energy, taking the bearing of 
potential energy on change of motion to be necessary; or one could take the 
connection between masses, separation, and potential energy to be necessary 
and take the regularity connecting potential energy and change of motion to 
be contingent. On the latter reconstruction we can take potential energy to 
supervene upon mass and spatial separation, thus giving a local physicalist 
the best chance of presenting classical physics as underwriting a local 
physicalist world view. Even so, potential energy supervenes on mass and 
spatial separation, and we learn from relativity that there is no absolute 
space, the points of which (together with mass) might serve as a super- 
venience base on which potential energy could supervene. Though we are 
less clear about how to analyse causes acting at a distance, they similarly 
depend on masses and spatial separations, so their supervenient status will 
be at best no better than that of potential energy. It is possible, however, that 
the relationality of space can be circumvented. While spatial points are 
inescapably relational, one can work instead with full fledged space-time 
points. Philosophers continue to debate relationality of space-time in the 
form of debate between relationalism and substantivalism, an issue which 
one cannot settle on straightforwardly physical grounds as one can in 
showing the relationality of space. (I will take up the issue of the absolute vs. 
the relational status of space-time in Teller [to appear, a]). 




A second example, that of point particles under the influence of 
electromagnetic forces in relativistic electrodynamics, develops the field 
concept nascent in the idea of potential energy, giving fields a much more 
substantive role. Electromagnetic fields—light is an example—have energy 
and momentum, and propagate on their own through space-time. Because 
the fields themselves propagate, electromagnetism is free from the begin- 
ning from action at a distance and inherently relational properties. Again, a 
differential equation describes change, this time change in the electro- 
magnetic fields. Differences in the fields in neighbouring spatial cells govern 
changes in fields between neighbouring temporal cells, so that changes in 
the field move, or propagate, through space-time rather like ripples on the 
surface of a pond. Propagating electromagnetic fields then serve as the 
vehicle by which distant particles affect each other. A particle at point a can 
affect the fields present at a. This change then propagates through space- 
time to b where the change in fields affects a particle at b. Action at a distance 
is replaced in a thoroughgoing way by relations between infinitesimally 
separated neighbours, which relations are in turn plausibly taken to 
supervene on the non-relational properties of the relata. Neither action at a 
distance nor distant spatial separation threatens to enter the picture to spoil 
the idea of the world working as a giant mechanism, understandable in terms 
of the working of the individual parts. To be sure, we have had to liberalise 
our notion of ‘part’ to include field quantities at space-time points. But in 
supporting the conception of local physicalism, relativistic electrodynamics 
counts as at least as classical as classical mechanics. 


4 RELATIONAL PROPERTIES IN QUANTUM MECHANICS¹ 


A massive reentry of inherent relations provides one mark of the sharply 
non-classical nature of quantum mechanics. Inherent relationality infects 
classical mechanics and special relativity at worst in the relationality of 
space-time, if space-time is inherently relational. But inherently relational 
properties inundate quantum mechanics, at least if we take the state function 
to attribute properties to individual micro systems. 

To understand how inherent relations get into quantum mechanics we 
need to understand superimposed properties. To set ideas I will begin with a 
graphic statement, warranted only on certain interpretations of the state 
function. A quantum mechanical system can have certain properties, such as 
an exact position, $x_1$, an exact position, $x_2$, an exact momentum, $p_1$, or an 
exact momentum, $p_2$. (To smooth exposition I pretend that position and 
momentum are discrete quantities. This simplification distorts some of the 
mathematical facts of the theory but does not affect any of the conceptual 
points at issue.) Superposition is a way in which properties of one kind, say 
position, can ‘combine’ to form properties of another kind, such as 
momentum. The two exact positions, $x_1$ and $x_2$, can combine or ‘superimpose’ 
to make a new property, say $p_1$, which is distinct from both of the 
original superimposed properties, $x_1$ and $x_2$, and indeed from all exact 
positions. 

1 From one perspective I do little more in this section than restate an idea put forward in a 
number of places by Allen Stairs, but most fully in his [to appear]. Stairs in turn takes an 
important lead from Shimony [1978, 1980], and one should also mention d’Espagnat as an 
important source for these ideas, e.g. [1973]. I hope that by using the notion of supervenience, 
and by drawing connections with some more traditional themes, I have succeeded in throwing 
a little additional light on ideas which are not in any way original with me. 

A more careful statement of superposition also clarifies what super- 
position involves. Quantum mechanics delivers its information about 
objects or systems in terms of mathematical descriptions called ‘state 
vectors’ or ‘state functions’ which are linked to observational results, such as 
observed positions or momenta, in terms of probabilities. A system 
characterised by a given state function will reveal a given observational 
characteristic with a probability which the state function specifies by a 
simple algorithm. In special cases these probabilities will be 1. One state 
function will give an object a probability of 1 for having exact position $x_1$, a 
second state function will give an object a probability of 1 for having position 
$x_2$, a third a probability of 1 for $p_1$, a fourth a probability of 1 for $p_2$, and so 
on. These special state functions are called eigenstates (or proper states) for 
the properties assigned a probability of 1. For the moment I want to 
understand eigenstates, and state functions generally, just as mathematical 
descriptions and stay neutral as to how they should be interpreted. 

The idea of superposition enters because state functions can be numeri- 
cally added, and when we add two state functions together we are doing 
something just like adding together the description of two wave processes 
and getting a description of a new wave process, where the new wave process 
is the one obtained by superimposing the two wave processes described by 
the original state functions. The superposition of state functions then 
applies to properties, such as the ‘superposition of two exact positions’, as 
follows: If we start with an eigenstate for the position $x_1$ and a second 
eigenstate for position $x_2$ and then add them together we get a new state 
which is an eigenstate for momentum $p_1$ and which is not an eigenstate for 
any exact position. 
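
The addition of eigenstates just described can be tried out in a deliberately small toy model. The sketch below is an illustration only and is not taken from the text: it assumes a particle on a ring of two sites, represents a state as a list of amplitudes, and treats a state that is reproduced (up to a scalar factor) by translation around the ring as the toy analogue of a momentum eigenstate. All function names are hypothetical, and the states are left unnormalised.

```python
# Toy model (an assumption, not from the text): a particle on a ring of two
# sites. A state is a list of amplitudes, one amplitude per site.

def add(u, v):
    """Superpose two states by adding their amplitudes componentwise."""
    return [a + b for a, b in zip(u, v)]

def translate(state):
    """Shift every amplitude one site around the ring."""
    return state[-1:] + state[:-1]

def is_position_eigenstate(state):
    """True when exactly one site carries a non-zero amplitude."""
    return sum(1 for a in state if a != 0) == 1

def is_momentum_eigenstate(state):
    """True when translation reproduces the state up to a scalar factor."""
    shifted = translate(state)
    k = next(i for i, a in enumerate(state) if a != 0)
    factor = shifted[k] / state[k]
    return all(abs(s - factor * a) < 1e-12 for s, a in zip(shifted, state))

x1 = [1.0, 0.0]   # eigenstate for position x_1
x2 = [0.0, 1.0]   # eigenstate for position x_2
p1 = add(x1, x2)  # their superposition (unnormalised)

print(is_position_eigenstate(p1), is_momentum_eigenstate(p1))
```

In this toy setting the sum of the two position eigenstates is no longer an eigenstate of any position, but it is an eigenstate of translation, which mirrors the claim made above for $x_1$, $x_2$ and $p_1$.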

How we should understand this more careful statement of superposition 
of properties depends on how we should interpret state functions. Let me 
illustrate with a strong interpretation. Suppose we say that a system has a 
property if and only if it is in an eigenstate for that property. Suppose we also 
take state functions, including eigenstates, to be referring expressions, 
which refer to actual states of the systems to which they apply. In particular, 
we will take eigenstates to refer to the properties specified by the eigenstate, 
that is, the property which an object has just in case it is in the corresponding 
eigenstate. Under these assumptions the careful statement of superposition 
provides a way of understanding the initial strong statement by giving us a 
clean connection between eigenstates and their associated properties: To say 
that momentum $p_1$ is a superposition of positions $x_1$ and $x_2$ is just to say that
the eigenstate referring to $p_1$ is the superposition (the sum) of the eigenstate
referring to $x_1$ and the eigenstate referring to $x_2$.

However, we are not forced to embrace such a strong interpretation. At 
the opposite extreme we have an instrumentalist interpretation of the state 
function which casts the state function as a mere verbal mechanism, an aid in 
predicting observational results, but not a referring or otherwise meaningful 
expression. On an instrumentalist interpretation we have no reason to think 
that the world exhibits these surprising properties which are interconnected 
by superposition. Since inherently relational properties enter quantum 
mechanics as special kinds of superposition, and since instrumentalism 
takes quantum mechanics to have no superimposed or superimposable 
properties, inherently relational properties are not inevitable in quantum 
mechanics. But I am very reluctant to settle for instrumentalism. In what 
follows I will lay instrumentalism aside and restrict attention to what 
happens when we take state functions to describe individual systems in a 
non-instrumentalistic way, so that there is some combination of or 
systematic connection between properties or states of real objects as 
described by the formal manipulation of adding state functions. (Present 
questions become more complicated if we give an ensemble interpretation to 
the state function. I will extend the present discussion to ensemble 
interpretations in Teller [to appear b].) 

Even on a non-instrumentalist interpretation, superposition does not yet 
indicate anything about relational properties. In the example I cited two 
properties of an object may superimpose. But the new superimposed 
property is simply another non-relational property of the same object. In 
order to get relational properties we must look at the superposition of 
properties of groups of objects taken together. Let’s consider two objects, a 
and b; and let’s consider their properties, which I will indicate generically 
with the letter ‘w’. A superscript on ‘w’ tells us which object is in question 
and a subscript indicates the specific property. Thus $w^a_1$ indicates the
circumstance of a having property $w_1$, $w^b_2$ indicates the circumstance of b
having property $w_2$, and so on. I will also use the same expressions to
indicate the associated eigenstates. Thus $w^a_1$ indicates the eigenstate in which
a has property $w_1$ with probability 1, etc. Next we consider the properties of
each of a and b when a and b are taken together. For example, $w^a_1 w^b_2$ indicates
the circumstance of a having $w_1$ and b having $w_2$; and this same expression,
$w^a_1 w^b_2$, also indicates the compound eigenstate in which a is sure to have $w_1$
and b is sure to have $w_2$. While $w^a_1 w^b_2$ might be taken to be a relational
property, it is not a candidate for an inherently relational property. Exactly
the same goes for the distinct compound property or eigenstate $w^a_2 w^b_1$. But
given these two compound properties, quantum mechanics tells us that their
superposition, $w^a_1 w^b_2 + w^a_2 w^b_1$, is also a property which the pair, a, b, can have.
This property is a relational property which holds between a and b, or of a 
and b collectively; and except in degenerate cases this property does not 
reduce to or supervene upon non-relational properties of a and b taken 
separately. This conclusion follows immediately under the assumption of
the strong interpretation, that a system has a property if and only if it is in the
corresponding eigenstate. With this assumption the conclusion is an
immediate consequence of the fact that for all i and j (excluding some special
degenerate cases) states $w^a_i w^b_j$ and $w^a_1 w^b_2 + w^a_2 w^b_1$ are distinct so that a system
in one is never in the other. The conclusion can also be argued without the
assumption that if a system has a property then it is in the corresponding
eigenstate. The argument is not at all difficult, but as it requires departing
from the non-technical spirit of this paper, I postpone it until a future more
detailed, technical exposition of these ideas [Teller, to appear b].¹ The
argument I have in mind still assumes the other half of the strong 
interpretation, that eigenstates attribute a corresponding eigenproperty to a 
system. What happens if this assumption is given up while retaining the idea 
that individual systems still have ‘hidden’, and in general dispositional 
properties? Michael Redhead has some very interesting suggestions con- 
cerning this possibility which I hope he will publish soon. 
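
The contrast between a compound eigenstate and a superposition of two compound eigenstates can be checked numerically in the simplest case. The following sketch is an illustration under stated assumptions, not the postponed technical argument: each of a and b is assumed to have just two eigenstates, written as amplitude pairs, and a compound state is stored as a 2x2 table of coefficients. Such a table factorises into one state for a and one state for b exactly when its determinant vanishes. The helper names are hypothetical.

```python
# Assumption: each object has two eigenstates, w_1 = [1, 0] and w_2 = [0, 1].

def tensor(u, v):
    """Compound state of a (in state u) and b (in state v) as a 2x2 table."""
    return [[a * b for b in v] for a in u]

def add(c, d):
    """Superpose two compound states entry by entry."""
    return [[x + y for x, y in zip(row_c, row_d)] for row_c, row_d in zip(c, d)]

def is_product(c, tol=1e-12):
    """A 2x2 table factorises into separate states iff its determinant is 0."""
    return abs(c[0][0] * c[1][1] - c[0][1] * c[1][0]) <= tol

w1 = [1.0, 0.0]
w2 = [0.0, 1.0]

compound_12 = tensor(w1, w2)                # a has w_1, b has w_2
compound_21 = tensor(w2, w1)                # a has w_2, b has w_1
superposed = add(compound_12, compound_21)  # their (unnormalised) sum

print(is_product(compound_12), is_product(compound_21), is_product(superposed))
```

Each compound eigenstate factorises, while their sum does not: no assignment of separate states to a and to b reproduces it, which is the formal counterpart of the property failing to reduce to non-relational properties of a and b taken separately.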
Not all such quantum mechanical inherently relational properties have
clear-cut physical significance. But many do; and, more importantly, in at
least some cases the inherently relational properties, such as $w^a_1 w^b_2 + w^a_2 w^b_1$,
are independently ascertainable properties of pairs of systems and not just
theoretical phantasms of quantum mechanics. The example of Einstein,
Podolsky and Rosen is a superposition of (actually infinitely many) exact 
positions of a and of b which constitutes an eigenstate of the distance 
between a and b, and also an eigenstate of the total momentum of a and b, but 
not of any specific positions or momenta of a or of b. Postulating ‘hidden 
states’ does not provide a supervenient base for the EPR relational property. 
Even if one assumes that quantum mechanics is incomplete and that, in 
addition to the properties attributed by the state function, a and b each have 
definite positions and momenta not predicted or described by quantum 
mechanics, the EPR relational property does not supervene on the 
postulated non-relational properties of the individuals. Exactly the same 
kind of thing goes for the example used by Bell. In this case we consider two
particles with the non-classical property of spin, and we construct a certain
simple superposition of two non-relational spin states of a and of b. In this
superposition quantum mechanics does not assign a definite spin to either
particle in any direction, and in analogy to the last case, even if one
postulates specific spins for particles² in the superimposed state, the
property described by the superposition does not supervene on the
postulated individual properties. Once again the superposition characterises
an independently identifiable property with distinctive experimental
implications for the a-b system as a whole.

1 If we restrict attention to hidden states such as position, momentum, and spin the point is
immediate; for example in correlated spin systems z-spin up on system 1 and z-spin down on
system 2 is consistent with many different quantum states of the whole system. Ruling out
more general ‘hidden variables’ as a possible supervenience base for inherent relations
requires the ‘no hidden variable’ arguments of Bell, Kochen and Specker, and others. These
seem to show that hidden variables must be ‘contextual’, which comes to the same thing as
being relational.

2 In view of Bell’s theorem, most people conclude that such a postulation of hidden spin states
requires violation of some sort of locality principle. The notion of ‘locality’ involved is only
quite indirectly connected with the notion of ‘locality’ in ‘local physicalism’.


5 CONCLUSION 


Earlier I suggested that inherent relations provide an intelligible formu- 
lation of holism, which I dubbed ‘relational holism’. This idea now applies 
to quantum mechanics. The strange ways in which things seem interdependent
in quantum mechanics have often suggested holism to interpreters,
but they have been reluctant or uneasy about embracing the holism
because of its obscurity. The suggestion of applying relational holism to this 
problem gives a proposal for further clarification and critical examination. 
Quantum mechanics describes individuals which, often at least, we can 
distinguish from each other. But these distinguishable individuals can also 
have inherent relations. The work to be done includes a closer examination 
of whether we have a specifiable and acceptable account of distinguishable 
individuals which is consistent with the claim that such individuals may also 
have inherent relations. One worry I have in mind here concerns possible 
problems with quantum statistics for identical particles. It is possible that 
these quantum statistics can themselves be clarified by applying the idea of 
inherent relations.¹

Inherent relations are also important to interpreting quantum mechanics 
because they play a central role in many of the theory’s puzzling features. 
They are the relations which hold between measurement devices and 
measurement objects and which generate the problem of measurement. 
They are the relations which give rise to the non-classical statistics which 
violate Bell’s inequality. And so on. All such cases involve superposition 
manifested as inherent relations. Seeing that inherent relations are at the 
heart of these puzzles does not by itself lead to their resolution. At best this 
insight will facilitate a partial restatement. But given our state of bewilder- 
ment, restatement may be valuable. 

Most importantly, inherent relations give us an improved view of 
quantum mechanics’ break with classical intuitions, or at least an improved 
view of an important part of this break. I suggested earlier that we 
understand classical physicalism as local physicalism, that is as the 
supervenience of all the facts on the non-relational physical facts about 
physical objects. If we reject instrumentalism and take state functions to 
ascribe properties to and relations among objects, quantum mechanics tells 
us that physicalism can at best come to supervenience on inherent relations 
and that the world is a more deeply intermeshed web than we thought. 
Indeed, according to quantum mechanics, the extent of entanglement
through inherent relations is all pervasive. Each and every scattering
interaction gives rise to inherent relations, so that every non-isolated object
gets caught up with other objects in the web of the inherently relational. One
of quantum mechanics’ basic puzzles is how these inherent relations connect
with non-relational properties. The step we may need to take to advance our
physical theory and our conceptual scheme for the physical world may be to
come to terms with inherent relations and to understand how they give rise
to (or come to be seen as) the non-relational properties which have so far
formed the basis of our physical world view.

1 Bas van Fraassen’s [1984] discussion of quantum statistics may bear on this point.
A New Approach to Quantum Logic


by J. L. BELL


1 The Origins of Quantum Logic
2 Propositional Logic as a Logic of Attributes
3 The Manifestation of Attributes
4 Introducing Implication
5 Superposition of States


The idea of a ‘logic of quantum mechanics’ or quantum logic was originally suggested by 
Birkhoff and von Neumann in their pioneering paper [1936]. Since that time there has been 
much argument about whether, or in what sense, quantum ‘logic’ can be actually considered a 
true logic (see, e.g. Bell and Hallett [1982], Dummett [1976], Gardner [1971]) and, if so, how it
is to be distinguished from classical logic. In this paper I put forward a simple and natural 
semantical framework for quantum logic which reveals its difference from classical logic in a 
strikingly intuitive way, viz. through the fact that quantum logic admits (suitably formulated 
versions of) the characteristic quantum-mechanical notions of superposition and incompatibility 
of attributes. That is, precisely the features that distinguish quantum from classical physics also 
serve, within this framework, to distinguish quantum from classical logic. Some light is shed on 
the question of whether quantum logic is a genuine logical system by introducing a natural 
entailment relation for quantum-logical formulas with the implication symbol. The novelty is 
that, although implication behaves as it should (i.e. the ‘deduction theorem’ holds), the order of 
introduction of premises is significant. The fact that a reasonable entailment relation can be 
formulated for quantum logic supports the view that it is a genuine logical system and not 
merely an algebraic formalism. 


The paper is organised as follows. We begin with an account of the origins of 
quantum logic, based on Birkhoff and von Neumann [1936]. In §2 a 
common semantical framework for intuitionistic, classical and quantum 
logic is formulated, employing the notion of an attribute over a space with a 
distinguished lattice of subsets (this framework was first introduced in Bell 
[1983]). In §3 we define the central concept of manifestation of attributes and
employ it to distinguish (intuitionistic and) classical logic from quantum 
logic. In §4 we introduce the logical operation of implication and show how 
the extension of the concept of manifestation to implication formulas leads 
both to general notions of superposition and incompatibility characteristic of 
quantum logic, and to the entailment relation mentioned above. Finally, in 
§5 we observe that the concept of superposition introduced here satisfies the 
conditions originally laid down by Dirac [1930] and that, interpreted within 
the ‘orthodox’ framework for quantum mechanics, it coincides with the 
usual notion of superposition of states. 
Readers not familiar with the mathematical formalism of quantum
mechanics may omit sections 1 and 5 without substantial loss. 


1 THE ORIGINS OF QUANTUM LOGIC


Let $S$ be a classical physical system and let $\Sigma$ be its phase space. We may
regard an observable on $S$ as being a function $f\colon \Sigma \to Q$ where the codomain
$Q$, the observation space of $f$, is the set of ‘values’ that $f$ can assume. (Typically
$Q$ will be a set of real numbers.) If $f_1, \dots, f_n$ are observables on $S$ with
observation spaces $Q_1, \dots, Q_n$, the observation space associated with the n-tuple
of observables $(f_1, \dots, f_n)$ is the Cartesian product $Q_1 \times \dots \times Q_n$.
Each subset $X$ of $Q_1 \times \dots \times Q_n$ is correlated with a proposition $P_X$ concerning
the state $x$ of $S$, namely the assertion that the n-tuple of measured values
of $f_1, \dots, f_n$ lies in $X$ when $S$ is in state $x$. $X$ has a representative $\bar{X}$ in $\Sigma$
defined by

\[ \bar{X} = \{x \in \Sigma : (f_1(x), \dots, f_n(x)) \in X\}. \]

Thus $\bar{X}$ is the set of states $x$ of $S$ such that $P_X$ is verified when $S$ is in state $x$.
Accordingly we may also call $\bar{X}$ the representative in $\Sigma$ of the associated
proposition $P_X$.

Notice that the relation of entailment between propositions corresponds to 
the relation of set-theoretic inclusion between their representatives; that the 
representative of the negation of a proposition is the set-theoretical 
complement (in E) of its representative; and that the representative of the 
conjunction of two propositions is the set-theoretical intersection of their 
representatives. It follows that the logic of propositions concerning a
classical system $S$ is isomorphic to a Boolean algebra of subsets of the phase
space of $S$.
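
The three correspondences just noted can be checked on a small finite example. The sketch below is only an illustration: the six-point ‘phase space’ and the single observable `f` are hypothetical stand-ins chosen here, and the representative of a set of values is computed directly from its definition.

```python
# Hypothetical finite stand-in for a phase space and one observable on it.
PHASE_SPACE = frozenset(range(6))
VALUES = frozenset({0, 1, 2})          # observation space of f

def f(x):
    """Hypothetical observable: maps each state to one of three values."""
    return x % 3

def representative(X):
    """Set of states x whose measured value f(x) lies in X."""
    return frozenset(x for x in PHASE_SPACE if f(x) in X)

X = frozenset({0})
Y = frozenset({0, 1})

# Entailment corresponds to inclusion of representatives.
entailment_ok = X <= Y and representative(X) <= representative(Y)

# Negation corresponds to the complement in the phase space.
negation_ok = representative(VALUES - X) == PHASE_SPACE - representative(X)

# Conjunction corresponds to intersection of representatives.
conjunction_ok = representative(X & Y) == representative(X) & representative(Y)

print(entailment_ok, negation_ok, conjunction_ok)
```

Because the representative is an inverse image, it preserves inclusion, complement and intersection automatically, which is why the resulting logic is Boolean.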

Turning now to the case of compatible observables in a quantum system, 
we find that the situation is broadly similar. Thus let $\mathfrak{Q}$ be a quantum system,
$H$ its phase space (Hilbert space) and $A_1, \dots, A_n$ compatible observables on
$\mathfrak{Q}$, i.e. commuting self-adjoint linear operators on $H$. (For simplicity we shall
assume that the eigenvalues of $A_1, \dots, A_n$ are discrete and nondegenerate.)
Since $A_1, \dots, A_n$ commute, $H$ has a basis $\{b_0, b_1, \dots\}$ consisting of common
eigenstates for the $A_i$. For each $i = 1, \dots, n$; $j = 0, 1, 2, \dots$ let $\lambda^i_j$ be the
eigenvalue of $A_i$ corresponding to the eigenstate $b_j$. Then for each $i$, the set
$\{\lambda^i_j : j = 0, 1, 2, \dots\}$ lists all the possible values that the observable $A_i$ can
assume, and may accordingly be regarded as being the observation space of
$A_i$.

What is the observation space for the n-tuple of observables $(A_1, \dots, A_n)$?
To determine this, let $(k_1, \dots, k_n)$ be an n-tuple of natural numbers, and
suppose that we are certain to get the result $(\lambda^1_{k_1}, \dots, \lambda^n_{k_n})$ by simultaneously
measuring $A_1, \dots, A_n$. The only state of $\mathfrak{Q}$ in which we are certain to get the
result $\lambda^1_{k_1}$ by measuring $A_1$ is (up to a scalar factor) $b_{k_1}$. So we are only certain
to get the result $(\lambda^1_{k_1}, \dots, \lambda^n_{k_n})$ by measuring $(A_1, \dots, A_n)$ when all the $b_{k_i}$ are
identical, i.e. when $k_1 = k_2 = \dots = k_n$. (To put it another way, if $\mathfrak{Q}$ is in a
state such that a measurement of one $A_i$ is certain to give the result $\lambda^i_j$, then a
measurement of any $A_k$ is certain to give the result $\lambda^k_j$ for the same $j$.) It
follows that the observation space for $(A_1, \dots, A_n)$ is the set of n-tuples
$\{(\lambda^1_j, \dots, \lambda^n_j) : j = 0, 1, 2, \dots\}$. Since the elements of this observation space are
indexed by the set $N$ of natural numbers, each subset $X$ of $N$ is correlated
with a proposition $P_X$ concerning the state $x$ of $\mathfrak{Q}$, viz., the assertion that the
n-tuple of measured values of $A_1, \dots, A_n$ is certain to be in the set
$\{(\lambda^1_j, \dots, \lambda^n_j) : j \in X\}$ when $\mathfrak{Q}$ is in state $x$. For each subset $X$ of $N$ we define the
representative $\bar{X}$ of $P_X$ (or $X$) to be the closed subspace of $H$ generated by the
set $\{b_n : n \in X\}$: this is natural since the elements of $\bar{X}$ are precisely those
states $x$ of $\mathfrak{Q}$ such that the proposition $P_X$ is verified when $\mathfrak{Q}$ is in state $x$.

In this case, too, we find that the relation of entailment between 
propositions corresponds to set-theoretical inclusion of their representa- 
tives and that the representative of the conjunction of two propositions is the 
set-theoretical intersection of their representatives. However, the rep- 
resentative of the negation of a proposition is no longer the set-theoretical 
complement, but rather the orthogonal complement of its representative. 
Nevertheless, it still follows that the logic of propositions involving
compatible observables on a quantum system $\mathfrak{Q}$ is isomorphic to a Boolean
algebra of (closed) subspaces of the phase space of $\mathfrak{Q}$.

So far so good. The difficulty arises when we try to extend the analysis to 
incompatible (i.e. non-commuting) observables: since non-commuting oper- 
ators have no common eigenbasis, the whole procedure collapses. Thus, for 
example, given two incompatible observables A, B, we can perfectly well 
form the observation spaces of A and B separately and then consider the 
representatives in H of propositions involving only A and propositions 
involving only B. But we have no way of representing propositions involving 
both A and B, e.g. the conjunction of propositions of the above sort. In their 
original paper [1936], Birkhoff and von Neumann propose to remove this 
obstruction by postulating that the intersection of the representatives of any 
pair of propositions—even those involving incompatible observables—is 
still the representative of some proposition, namely the ‘conjunction’ of the 
pair. (Of course, this is already the case for propositions involving only 
compatible observables.) As they point out, the simplest (if Procrustean) way 
of ensuring that this postulate holds is to assume that all self-adjoint 
operators on H are (or correspond to) observables. In this event, every 
closed subspace of H is the representative of a proposition and the 
(ortho)lattice of closed subspaces may then be regarded as the mathematical 
embodiment of a ‘logic’ of propositions, the so-called quantum logic. 

The problem with this approach is that, while the mathematical meaning 
of the operations of intersection and orthogonal complementation on the 
subspaces of H is perfectly clear, the logical meaning of the corresponding 
operations of ‘conjunction’ and ‘negation’ on the associated propositions is 
not. Thus arises the fundamental problem of meaning of quantum logic. 
My attempt to resolve this problem will hinge on two things: the
replacement of Hilbert spaces by more perspicuous structures, the so-called 
proximity spaces, and the analysis of propositional quantum logic in terms of 
the concept of an attribute defined over such a space. We turn to this now. 


2 PROPOSITIONAL LOGIC AS A LOGIC OF ATTRIBUTES 


Let us think of attributes or qualities like ‘blackness’, ‘hardness’, ‘having 
positive charge’, etc. as being possessed by or manifested over parts of a space 
(sometimes called a manifold or field). For instance, if the space is my 
sensory field, part of it manifests blackness and part manifests hardness and, 
e.g., a blackboard manifests both attributes. Each attribute is correlated 
with a proposition (more precisely, a propositional function) of the form: 
‘___ has the attribute in question.’ 

We shall use symbols A, B, C, to denote attributes. We assume that we are 
initially provided with a supply of atomic or primitive attributes, i.e. 
attributes not decomposable into simpler ones. For each such attribute A 
and each space S we also consider as given the total part of S which manifests 
A; this will be called the A-part of S and denoted by $[A]_S$. (Thus, for
instance, if S is my sensory field and A is the attribute ‘red’, then $[A]_S$ is the
total part of S that is coloured red: the red part of S.)

Attributes may be combined by means of the logical operators $\wedge$ (and), $\vee$
(or), $\neg$ (not) to form compound or molecular attributes.¹ The term ‘attribute’
will accordingly be extended to include compound attributes as well as
primitive ones. It follows that (symbols for) attributes may be regarded as
the formulas of a propositional language $\mathcal{L}$, the language of attributes, and
we shall use the terms ‘attribute’ and ‘formula’ synonymously.

In order to be able to correlate parts of any given space S with compound
attributes, i.e. to be able to define the A-part of S for compound A, we need
to assume the presence of operations $\wedge$, $\vee$, $*$ (corresponding to $\wedge$, $\vee$, $\neg$)
on the parts of S. For then we will be able to define the A-part $[A]_S$ of the
space for arbitrary attributes A by recursion on the number of logical
operators in A according to the following scheme:

\begin{equation}
\begin{aligned}
[A \wedge B]_S &= [A]_S \wedge [B]_S \\
[A \vee B]_S &= [A]_S \vee [B]_S \\
[\neg A]_S &= ([A]_S)^*
\end{aligned}
\tag{2.1}
\end{equation}

($[A]_S$ is also called the value of A in S.) Once this is done, we can then define
the basic relation $\vDash_S$ of entailment or inclusion between attributes over S:

\[ A \vDash_S B \quad\text{iff}\quad [A]_S \subseteq [B]_S. \]
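
The recursion (2.1) and the entailment relation can be sketched in a few lines. The example below is a minimal illustration under an assumption made here: the parts of the space are taken to be all subsets of a finite set, with the operations on parts being ordinary intersection, union and complement. The six-point space and the attribute names are hypothetical.

```python
# Assumption: every subset of S counts as a part, and the operations on
# parts are set intersection, set union and set complement.
S = frozenset(range(1, 7))   # hypothetical space: the faces of a die

def value(formula, interpretation):
    """Compute [formula]_S by recursion on the logical operators, as in (2.1)."""
    if isinstance(formula, str):            # primitive attribute
        return interpretation[formula]
    op, *args = formula
    vals = [value(arg, interpretation) for arg in args]
    if op == "and":
        return vals[0] & vals[1]
    if op == "or":
        return vals[0] | vals[1]
    if op == "not":
        return S - vals[0]
    raise ValueError(f"unknown operator: {op}")

def entails(a, b, interpretation):
    """A entails B over S iff [A]_S is included in [B]_S."""
    return value(a, interpretation) <= value(b, interpretation)

interp = {"even": frozenset({2, 4, 6}), "high": frozenset({4, 5, 6})}

even_and_high = ("and", "even", "high")
print(sorted(value(even_and_high, interp)))
print(entails(even_and_high, "even", interp))
```

Formulas are written as nested tuples purely for convenience; any representation that exposes the outermost operator would serve equally well.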


Now the conventional meaning of ‘$\wedge$’ dictates that, for any attributes A
and B, we should have $A \wedge B \vDash_S A$, $A \wedge B \vDash_S B$ and, for any C, if $C \vDash_S A$
and $C \vDash_S B$ then $C \vDash_S A \wedge B$. In other words, $[A \wedge B]_S$ should be taken to
be the largest part (with respect to set-theoretic inclusion) of S included in
both $[A]_S$ and $[B]_S$. By the first equation of (2.1), the same must then be true
of $[A]_S \wedge [B]_S$. Consequently, for any parts U, V of S, $U \wedge V$ should be the
largest part of S included in both U and V.

1 Note that ‘$\rightarrow$’ (implication) is for the moment omitted. We make up for this deficiency in §4.

Similarly, now using the conventional meaning of ‘$\vee$’, we conclude that,
for any parts U, V of S, $U \vee V$ should be the smallest part of S which
includes both U and V.

We suppose that ‘$\neg$’ satisfies the law of ex falso quodlibet: thus if A is an
attribute, then $A \wedge \neg A \vDash_S B$ for any B. In other words
$[A \wedge \neg A]_S \subseteq [B]_S$ or, using (2.1), $[A]_S \wedge [A]_S^* \subseteq [B]_S$ for any B. If we
assume that there is a vacuous attribute B for which $[B]_S = \emptyset$, the empty
part of S, it follows that $[A]_S \wedge [A]_S^* = \emptyset$. Consequently, for any part U of
S we should require that $U \wedge U^* = \emptyset$, i.e. that U and $U^*$ be ‘mutually
exclusive’.

It follows from these considerations that we should take the parts of a
space S to constitute a lattice of subsets of (the underlying set of) S, on which
is defined an additional operation $*$ (‘complementation’) corresponding to
negation (or exclusion) satisfying the condition of mutual exclusiveness
mentioned above. Formally, a lattice of subsets of a set S is a family L of
subsets of S containing $\emptyset$ and S such that for any $U, V \in L$ there are elements
$U \wedge V$, $U \vee V \in L$ such that $U \wedge V$ is the largest (with respect to $\subseteq$)
element of L included in both U and V and $U \vee V$ is the smallest (with
respect to $\subseteq$) element of L which includes both U and V. $U \wedge V$, $U \vee V$ are
called the meet and join, respectively, of U and V. A lattice of subsets of S
equipped with an operation $*\colon L \to L$ satisfying $U \wedge U^* = \emptyset$ for all $U \in L$
will be called a *-lattice of subsets of S.

We can now formally define a space to be a pair S = (S, L) consisting of a
set S and a *-lattice L of subsets of S. Elements of L are called parts of S, and
L is called the lattice of parts of S.

In practice we shall only need to consider the following sorts of space, so 
henceforth the term ‘space’ will connote one of the following 3 kinds: 


(1) Topological spaces. In this case S = (S, L) is a set S equipped with a
topology L. Here the meet and join operations in L are just set-theoretical
intersection and union, and the $*$ operation is given by $U^*$ = interior of
$S - U$, for $U \in L$.

(2) Discrete spaces. These are the special cases of (1) in which the topology
L on S is the family PS of all subsets of S. The *-operation on L is then just
ordinary set-theoretic complementation in S.

(3) Proximity spaces. A proximity structure is a set S equipped with a
proximity relation, i.e. a symmetric reflexive binary relation $\approx$. (The reason
for using the term ‘proximity’ is, as we shall see, that it is helpful to think of
$x \approx y$ as meaning that x is near y. Caution: $\approx$ is not generally transitive!) For
each $x \in S$ we define the quantum at x, $Q_x$, by

\begin{equation}
Q_x = \{y \in S : x \approx y\}. \tag{2.2}
\end{equation}

Unions of families of quanta are called parts of S; thus a part of S is a subset
of the form

\[ \bigcup_{x \in A} Q_x \]

for $A \subseteq S$. It can be shown (see Bell [1983]) that the family Part(S) of parts
of S forms a *-lattice¹ of subsets of S, in which the join operation is
set-theoretical union, the meet of two parts of S is the union of all quanta
included in their set-theoretical intersection, and, for $U \in \mathrm{Part}(S)$,

\begin{equation}
U^* = \{y \in S : \exists x \notin U\colon x \approx y\}. \tag{2.3}
\end{equation}

The pair S = (S, Part(S)) is called a proximity space.

Observe that any discrete space is a proximity space in which $\approx$ is the
equality relation. More generally, it is quite easily shown that a proximity
space S is a topological space if and only if its proximity relation is transitive,
and that in this case S is almost discrete in the sense that its lattice of parts is
isomorphic to the lattice of parts of a discrete space.
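
The definitions of quanta, parts, meet and complement are concrete enough to be computed directly on a finite example. The sketch below is an illustration, not part of the cited proofs. It assumes a hypothetical five-point space in which two points are near when they differ by at most 1, takes $U^*$ to be the union of the quanta at the points lying outside U, and checks by brute force the three identities $U^{**} = U$, $U \cup U^* = S$ and $U \wedge U^* = \emptyset$ for every part.

```python
from itertools import combinations

# Hypothetical proximity structure: five points, near iff they differ by <= 1.
POINTS = frozenset(range(5))

def near(m, n):
    return abs(m - n) <= 1

def quantum(x):
    """The quantum at x: all points near x."""
    return frozenset(y for y in POINTS if near(x, y))

def union(sets):
    return frozenset().union(*sets)

def all_parts():
    """Every union of a family of quanta (including the empty family)."""
    quanta = [quantum(x) for x in POINTS]
    return {union(family)
            for r in range(len(quanta) + 1)
            for family in combinations(quanta, r)}

def meet(u, v):
    """Union of all quanta included in the intersection of u and v."""
    return union(quantum(x) for x in POINTS if quantum(x) <= (u & v))

def star(u):
    """Complement of a part: union of the quanta at points outside u."""
    return union(quantum(x) for x in POINTS if x not in u)

transitive = all(near(x, z)
                 for x in POINTS for y in POINTS for z in POINTS
                 if near(x, y) and near(y, z))

laws_hold = all(star(star(u)) == u
                and (u | star(u)) == POINTS
                and meet(u, star(u)) == frozenset()
                for u in all_parts())

print(transitive, laws_hold)
```

The relation chosen here is not transitive (0 is near 1 and 1 is near 2, but 0 is not near 2), so this space is a proximity space that is not a topological space.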

Proximity structures (or spaces) S admit several interpretations which 
serve to reveal their significance. 


(a) S may be viewed as a ‘space’ or field of perception, its points as
locations in it, the relation $\approx$ as representing the indiscernibility of locations,
the quantum at a given location as the minimum perceptibilium at that
location, and the parts of S as the perceptibly specifiable subregions of S.
This idea is best illustrated by assigning the set S a metric $\delta$, choosing a fixed
$\varepsilon > 0$ and then defining $x \approx y \iff \delta(x, y) \le \varepsilon$.

(b) S may be thought of as the set of outcomes of an experiment and $\approx$ as
the relation of equality up to the limits of experimental error. The quantum at 
an outcome is then the ‘outcome within a specified margin of error’ of 
experimental practice. 

(c) S may be taken to be the set of states of a quantum system and $s \approx t$ as
the relation: ‘a measurement of the system in state s has a non-zero
probability of leaving the system in state t, or vice-versa.’ More precisely, we
take a Hilbert space H, put $S = H - \{0\}$, and define the proximity relation $\approx$
on S by $s \approx t \iff \langle s, t \rangle \neq 0$ (s is not orthogonal to t). It is then readily shown
that the *-lattice of parts of S is isomorphic to the *-(ortho)lattice of closed
subspaces of H. Consequently, *-lattices of parts of proximity spaces include
the *-lattices of closed subspaces of Hilbert spaces, that is, the lattices associated with
Birkhoff and von Neumann’s ‘quantum logic’. This observation will be
employed later on.


(d) S may be taken to be the set of hyperreal numbers in a model of
Robinson’s nonstandard analysis (see, e.g. Bell and Machover [1977]) and $\approx$
as the relation of infinitesimal nearness. In this case $\approx$ is transitive.

(e) S may be taken to be the affine line in a model of synthetic differential
geometry (see Kock [1981]). In this case there exist many square-zero
infinitesimals in S, i.e. elements $\varepsilon \neq 0$ such that $\varepsilon^2 = 0$, and we take $x \approx y$ to
mean that the difference $x - y$ is such an infinitesimal, i.e. $(x - y)^2 = 0$.
Unlike the situation in case (d), the relation $\approx$ here is not generally
transitive.

1 Actually Part(S) has the structure of a complete ortholattice (see Bell [1983] or Birkhoff [1960])
for we have, for any $U, V \in \mathrm{Part}(S)$: $U^{**} = U$, $U \vee U^* = S$, $U \wedge U^* = \emptyset$,
$U \subseteq V \Rightarrow V^* \subseteq U^*$.


Given a space S = (S, L) we define an interpretation of the language $\mathcal{L}$ of
attributes to be an assignment, to each primitive attribute A (i.e. atomic
formula of $\mathcal{L}$) of a part $[A]_S$ of S. Then we can extend the assignment of
parts of S to all attributes recursively as in (2.1).

Let us call a formula A S-valid if $[A]_S = S$. If $\mathcal{C}$ is a class of spaces, we say
that A is $\mathcal{C}$-valid if it is S-valid for all $S \in \mathcal{C}$. The purpose of introducing
this concept of validity is that it enables us to characterise the tautological
statements (truths) of various logical systems. Let $\mathcal{T}op$, $\mathcal{D}is$ and $\mathcal{P}rox$ be the
classes of topological spaces, discrete spaces and proximity spaces,
respectively. It is well known (cf. Rasiowa and Sikorski [1963], ch. IX, §3) that the
$\mathcal{T}op$-valid formulas of $\mathcal{L}$ coincide with the tautologies of intuitionistic logic in
$\mathcal{L}$, and (ibid., ch. VII, §1) the $\mathcal{D}is$-valid formulas with the tautologies of
classical logic. Now, as we have observed, the lattices of parts of proximity
spaces include the lattices associated with Birkhoff and von Neumann’s
‘quantum logic’. So it is natural to identify the $\mathcal{P}rox$-valid formulas (of $\mathcal{L}$) as
the tautologies of quantum logic (in $\mathcal{L}$).

Let us write I, K, Q for the sets of tautologies of intuitionistic, classical, 
and quantum logic, respectively. Clearly we have the relation 


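\[ I \cup Q \subseteq K. \]

Moreover, we have

\[ Q \not\subseteq I, \qquad I \not\subseteq Q, \qquad I \cup Q \neq K \]

since, for formulas A, B,

\begin{align}
A \vee \neg A &\in Q - I \tag{2.4} \\
\neg[A \wedge \neg(A \wedge B) \wedge \neg(A \wedge \neg B)] &\in I - Q \tag{2.5} \\
\neg A \vee (A \wedge B) \vee (A \wedge \neg B) &\in K - (I \cup Q). \tag{2.6}
\end{align}

To prove (2.4), we note that $A \vee \neg A \in Q$ is an immediate consequence of
the evident fact that $U \cup U^* = S$ for any part U of a proximity space S
(where $U^*$ is defined in (2.3)). That $A \vee \neg A \notin I$ is, of course, well-known.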
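
The well-known failure of $A \vee \neg A$ in intuitionistic logic can be seen in the present framework with a very small topological space. The sketch below is an illustration chosen here, not an argument from the text: it assumes the two-point space whose open sets are the empty set, {1} and the whole space (often called the Sierpinski space), and uses the rule that $U^*$ is the interior of the set-theoretical complement of U.

```python
# Assumed example: two points, with open sets {}, {1} and {0, 1}.
S = frozenset({0, 1})
OPENS = [frozenset(), frozenset({1}), S]

def interior(u):
    """Largest open set included in u: the union of all opens inside u."""
    return frozenset().union(*[o for o in OPENS if o <= u])

def star(u):
    """Complement operation of a topological space: interior of S - u."""
    return interior(S - u)

U = frozenset({1})            # interpret the attribute A by this open set
excluded_middle = U | star(U)  # value of 'A or not A' under that interpretation

print(sorted(excluded_middle), excluded_middle == S)
```

Since the value of $A \vee \neg A$ under this interpretation is a proper subset of the space, the formula is not valid in every topological space, in contrast with the proximity-space case, where $U \cup U^* = S$ holds for every part.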

For (2.5), the formula C on the left-hand side is evidently a classical
tautology and contains no connectives except $\wedge$ and $\neg$. So by a well-known
result of Gödel (ibid., ch. IX, §5) C is an intuitionistic tautology and hence
$C \in I$. To show that $C \notin Q$, it is enough to construct a proximity space S and
an interpretation of primitive formulas A, B in S for which $[C]_S \neq S$. To
this end let S be the set {0, 1, 2, 3}; define the relation $\approx$ on S by

$m \approx n \Leftrightarrow |m - n| \neq 2$.

Clearly $\approx$ is a proximity relation on S. Define an interpretation of A, B in
the resulting proximity space S by $[A]_S = Q_0$, $[B]_S = Q_1$ (recalling the
definition of $Q_x$ given in (2.2)). It is then easily verified that

$Q_0^* = Q_2, \quad Q_1^* = Q_3, \quad Q_0 \wedge Q_1 = Q_0 \wedge Q_1^* = \emptyset$.


Consequently, 


$[C]_S = ([A]_S \wedge [\neg(A \wedge B)]_S \wedge [\neg(A \wedge \neg B)]_S)^*$
$= (Q_0 \wedge S \wedge S)^* = Q_2 \neq S$.


The result follows. 
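
The finite computation above can be checked mechanically. The following is a minimal sketch, not the author's own procedure. It assumes that (2.2) defines the quantum $Q_x$ as the set of points proximate to x, that (2.3) defines $U^*$ as the union of the quanta of the points outside U, and that the meet of two parts is the largest part inside their intersection; those definitions lie outside this passage, so they are labelled as assumptions in the code. The formulas C and D are written out as $\neg[A \wedge \neg(A \wedge B) \wedge \neg(A \wedge \neg B)]$ and $\neg A \vee (A \wedge B) \vee (A \wedge \neg B)$.

```python
POINTS = (0, 1, 2, 3)
FULL = frozenset(POINTS)


def near(m, n):
    """Proximity relation of the example: m ~ n iff |m - n| != 2."""
    return abs(m - n) != 2


def quantum(x):
    """Q_x: every point proximate to x (assumed form of definition (2.2))."""
    return frozenset(y for y in POINTS if near(x, y))


QUANTA = [quantum(x) for x in POINTS]


def meet(u, v):
    """Assumed lattice meet: union of the quanta lying inside u & v."""
    return frozenset().union(*[q for q in QUANTA if q <= (u & v)])


def star(u):
    """Assumed form of (2.3): union of the quanta of points outside u."""
    return frozenset().union(*[quantum(x) for x in POINTS if x not in u])


# Interpretation used in the text: [A] = Q_0, [B] = Q_1.
A, B = quantum(0), quantum(1)

# [C] for C = not[A and not(A and B) and not(A and not B)].
not_A_and_B = star(meet(A, B))
not_A_and_not_B = star(meet(A, star(B)))
C = star(meet(meet(A, not_A_and_B), not_A_and_not_B))

# [D] for D = not A or (A and B) or (A and not B); joins are plain unions.
D = star(A) | meet(A, B) | meet(A, star(B))

print(sorted(C), sorted(D), C == FULL)
```

Under these assumptions the sketch reproduces the identities quoted in the text, and both C and D receive the value $Q_2$ rather than the whole space.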

As for (2.6), the formula D on the left-hand side is evidently a classical 
tautology. It cannot, on the other hand, be an intuitionistic tautology since, 
if it were, by taking A to be itself an intuitionistic tautology, it would follow 
that $B \vee \neg B$ is an intuitionistic tautology, which as we know is not the case.
To see, finally, that $D \notin Q$, one uses the proximity space S defined above and
verifies that

$[D]_S = [C]_S \neq S$.


Thus quantum logic (as we have defined it) may be distinguished from 
classical (and intuitionistic) logic by the assertion that the formula displayed 
in (2.5)—a weak, if recherché, version of the distributive law—is a tautology 
of the latter systems but not of the former. But this, it seems to me, is a 
technical and somewhat opaque method of drawing the distinction: in the 
next section we show how to formulate it in a more striking and intuitively 
convincing way. 


3 THE MANIFESTATION OF ATTRIBUTES 


Given a space S and an interpretation of the language of attributes $\mathcal{L}$ in S, an
attribute A and a part U of S, it is natural to consider the relation $U \subseteq [A]_S$
as meaning that the part U is covered by the attribute A. Now for topological
(and discrete) spaces there is another way of obtaining the covering relation,
which is reminiscent of the definition of set-theoretic forcing. Namely, we
define the relation $U \Vdash_S A$, which shall be read U manifests A in S, by
recursion on the number of logical symbols in A as follows:

$U \Vdash_S A \Leftrightarrow U \subseteq [A]_S$ for primitive A
$U \Vdash_S A \wedge B \Leftrightarrow U \Vdash_S A$ & $U \Vdash_S B$
$U \Vdash_S A \vee B \Leftrightarrow V \Vdash_S A$ & $W \Vdash_S B$ for some parts V, W of S such that $U = V \cup W$
$U \Vdash_S \neg A \Leftrightarrow [V \Vdash_S A \Rightarrow V \subseteq U^*]$ for all parts V of S.

Thus U manifests a disjunction $A \vee B$ provided there is a ‘covering’ of U by
two ‘subparts’ manifesting A and B respectively, and U manifests a negation
$\neg A$ provided any part of S manifesting A is included in the ‘complement’ of
U.

Now it is easily shown by induction on the number of logical symbols in
formulas that for topological (and discrete) spaces S,

$U \Vdash_S A \Leftrightarrow U \subseteq [A]_S$. (3.1)

That is, for topological (and discrete) spaces, the covering relation and the
manifestation relation coincide. However, as we shall see, for proximity spaces
this is no longer the case. And, as we show presently, it is the manifestation
relation which is of real interest in this situation.

The coincidence of the manifestation and covering relations for topological
spaces has the following immediate consequence. If we define a space S to
support an attribute (formula) A when $S \Vdash_S A$ (which we shall abbreviate simply
to $\Vdash_S A$), then the tautologies of intuitionistic (resp. classical) logic are those
formulas which are supported by every topological (resp. discrete) space. At the
time of writing it is not known whether this result extends to quantum logic,
i.e. whether the tautologies of quantum logic coincide with the formulas
which are supported by every proximity space. (The claim in Bell [1983]
that this is the case was based on a result (Theorem 2.4 of that paper) which
has turned out to be false.) However, it can be shown that, for example, the
quantum-logical tautology $A \vee \neg A$ is supported by every proximity space
(as are, additionally, all quantum-logical tautologies not containing ‘$\vee$’).

Let us call an attribute A S-persistent (or persistent over S) if for all parts
U, V of S

$V \subseteq U$ & $U \Vdash_S A \Rightarrow V \Vdash_S A$.

(Note that a primitive attribute is always S-persistent. More generally, it is
not hard to show that the same is true for any attribute A not containing
occurrences of the disjunction symbol $\vee$.) And let us call a space S persistent
if every attribute is S-persistent (for any interpretation of $\mathcal{L}$ in S). By (3.1),
every topological (or discrete) space is persistent, so in particular the
tautologies of intuitionistic or classical logic are persistent over their
associated spaces (topological or discrete, respectively). As we now show, in
striking contrast, there are tautologies of quantum logic which are not
persistent over their associated spaces, viz., proximity spaces. This is
revealed by the following simple example of a non-persistent proximity space.

Consider the real line $\mathbb{R}$ with the proximity relation $x \approx y \Leftrightarrow |x - y| \leq \tfrac{1}{2}$,
and let $\mathbf{R}$ be the associated proximity space. The quantum at a point $x \in \mathbb{R}$ is
then the closed interval of length 1 centred on x. Suppose now we are given
two primitive attributes B (‘black’) and R (‘red’). Define interpretations of
B and R in $\mathbf{R}$ by

$[B]_{\mathbf{R}} = \bigcup \{[2n, 2n+1] : n \in \mathbb{Z}\}$
$[R]_{\mathbf{R}} = \bigcup \{[2n-1, 2n] : n \in \mathbb{Z}\}$.

(Here $\mathbb{Z}$ is the set of all integers and [a, b] is the closed
interval with endpoints a, b.) To put it more vividly, we ‘colour’ successive
unit segments of $\mathbb{R}$ alternately black and red. Clearly, then, $\mathbf{R}$ supports the
disjunction $R \vee B$. But if U is the quantum $Q_1 = [\tfrac{1}{2}, \tfrac{3}{2}]$ then $R \vee B$ is not
manifested over U, since U is evidently not covered by two subparts over
which R and B are manifested, respectively (indeed, U has no proper
subparts). Equally clearly, U does not manifest the quantum-logical
tautology $R \vee \neg R$ (nor, of course, $B \vee \neg B$).

Thus arises the curious phenomenon that, although we can see, by
surveying (a sufficiently large part of) the whole space $\mathbf{R}$, that the part U is
covered by redness and blackness, nonetheless U, unlike $\mathbf{R}$, does not split
into a red part and a black part. In some sense redness and blackness are
conjoined or superposed in U: it seems natural then to say that U manifests a
superposition of these attributes rather than a disjunction.

This concept of superposition of attributes turns out to admit a very
simple rigorous formulation. In the example we have just considered, the
part U manifests a superposition of the attributes R and B just when there is
a part V of the space which includes U and manifests $R \vee B$ (in this case, V
may be taken to be the whole space $\mathbf{R}$). Now this inevitably prompts the
following definition. Given a proximity space S, an interpretation of $\mathcal{L}$ in S
and attributes A, B, we say that a part U of S manifests a superposition of A
and B if there is a part V of S such that $U \subseteq V$ and $V \Vdash_S A \vee B$. Now for any
attribute C, it is readily shown that

$\exists V \supseteq U.\ V \Vdash_S C \Leftrightarrow U \Vdash_S \neg\neg C$.

(Consequently, $\neg\neg C$ is persistent.) So the condition that U manifest a
superposition of A and B is just

$U \Vdash_S \neg\neg(A \vee B)$.

It follows that superpositions are double negations of disjunctions. We shall
have more to say about superpositions in the sequel; in particular in the final
section we shall see how this concept of superposition relates to the usual
quantum-mechanical notion.
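
The manifestation clauses and the double-negation characterisation of superposition can be tried out on a small finite case. The sketch below is illustrative only: it uses the four-point proximity space with $m \approx n$ iff $|m - n| \neq 2$ instead of the real line, takes the parts to be all unions of quanta, and assumes that $U^*$ is the union of the quanta of the points outside U. The tuple encoding of formulas and every function name are hypothetical choices made for this example.

```python
from itertools import combinations

POINTS = (0, 1, 2, 3)


def quantum(x):
    """Q_x for the relation m ~ n iff |m - n| != 2."""
    return frozenset(y for y in POINTS if abs(x - y) != 2)


QUANTA = {quantum(x) for x in POINTS}

# Parts are assumed to be all unions of quanta, the empty union included.
PARTS = {frozenset()}
for r in range(1, len(QUANTA) + 1):
    for combo in combinations(QUANTA, r):
        PARTS.add(frozenset().union(*combo))


def star(u):
    """Assumed complement U*: union of the quanta of points outside u."""
    return frozenset().union(*[quantum(x) for x in POINTS if x not in u])


def manifests(u, f, val):
    """The recursive manifestation clauses of this section, taken literally."""
    kind = f[0]
    if kind == "atom":
        return u <= val[f[1]]
    if kind == "and":
        return manifests(u, f[1], val) and manifests(u, f[2], val)
    if kind == "or":
        return any(v | w == u and manifests(v, f[1], val) and manifests(w, f[2], val)
                   for v in PARTS for w in PARTS)
    if kind == "not":
        return all(v <= star(u) for v in PARTS if manifests(v, f[1], val))
    raise ValueError(kind)


def superposed(u, f, val):
    """Some part including u manifests f."""
    return any(u <= v and manifests(v, f, val) for v in PARTS)


VAL = {"A": quantum(0), "B": quantum(1)}
A, B = ("atom", "A"), ("atom", "B")
A_OR_B = ("or", A, B)
NOT_NOT_A_OR_B = ("not", ("not", A_OR_B))

Q2 = quantum(2)
print(manifests(Q2, A_OR_B, VAL), manifests(Q2, NOT_NOT_A_OR_B, VAL))
```

In this toy space the quantum $Q_2$ plays the role of the part U in the text: it manifests the double negation of $A \vee B$ without manifesting $A \vee B$ itself.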

To conclude this section, we show how our space $\mathbf{R}$ can be enriched so as
to furnish an interpretation of the quantum-mechanical notion of
incompatible attributes. To this end, suppose that, in addition to the two primitive
‘colour’ attributes R and B, we are given two primitive ‘charge’ attributes +
and −. Write Colour for the disjunction $R \vee B$ and Charge for the
disjunction $+ \vee -$. Interpret + and − in $\mathbf{R}$ by


$[+]_{\mathbf{R}} = \bigcup \{[2n - \tfrac{1}{2}, 2n + \tfrac{1}{2}] : n \in \mathbb{Z}\}$

Now clearly $\mathbf{R}$ supports Colour $\wedge$ Charge. But since, in $\mathbf{R}$,

$[+] \wedge [R] = [+] \wedge [B] = [-] \wedge [R] = [-] \wedge [B] = \emptyset$,

we have

$\Vdash_{\mathbf{R}} \neg(+ \wedge R) \wedge \neg(+ \wedge B) \wedge \neg(- \wedge R) \wedge \neg(- \wedge B)$.


In other words, despite the fact that the whole space R manifests both Charge 
and Colour, there is no non-empty part of the space which manifests both a 
specific charge and a specific colour. This situation is sufficiently similar to 
the familiar incompatibility of position and momentum measurements in 
quantum mechanics (‘any particle has both a position and a momentum, but 
not a specific position and a specific momentum’: cf. Putnam [1969]) to 
justify calling Colour and Charge incompatible attributes (over R). We shall 
have more to say about incompatibility once we have introduced the 
implication operation, a task we turn to in the next section. 


4 INTRODUCING IMPLICATION 


So far we have scrupulously avoided considering what is, in classical and
intuitionistic logic, a logical operation of fundamental importance, viz.,
implication. We shall now remedy this by expanding our language of
attributes $\mathcal{L}$ to include the implication symbol $\rightarrow$.

When S is a topological or discrete space, its lattice L of parts has a natural
‘implication’ operation $\rightarrow$ defined on it by

$U \rightarrow V$ = largest open set included in $V \cup (S - U)$
($= V \cup (S - U)$ when S is discrete).

So in either case $U \rightarrow V$ is the largest part of S whose intersection with U is
included in V. We can extend the interpretation in S of formulas of $\mathcal{L}$ to
include implication formulas $A \rightarrow B$ by the rule

$[A \rightarrow B]_S = [A]_S \rightarrow [B]_S$.

And it can then be readily shown that, if we extend the notion of validity to
implication formulas in the obvious way, the $\mathcal{T}op$- (respectively $\mathcal{D}is$-) valid
formulas (now also involving ‘$\rightarrow$’) continue to coincide with the
intuitionistic (respectively, classical) tautologies.
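
As a concrete illustration of the implication operation on open sets, here is a small sketch on a three-point set, once with a chain topology and once with the discrete topology. The particular spaces and all function names are assumptions chosen for the example; only the rule "largest open set included in $V \cup (S - U)$" comes from the text.

```python
from itertools import combinations

SPACE = frozenset({0, 1, 2})

# A chain topology on SPACE (hypothetical example) and the discrete topology.
CHAIN = {frozenset(), frozenset({0}), frozenset({0, 1}), SPACE}
DISCRETE = {frozenset(c) for r in range(len(SPACE) + 1)
            for c in combinations(sorted(SPACE), r)}


def interior(subset, opens):
    """Largest open set included in subset: the union of the opens inside it."""
    return frozenset().union(*[o for o in opens if o <= subset])


def implies(u, v, opens):
    """U -> V = largest open set included in V | (SPACE - U)."""
    return interior(v | (SPACE - u), opens)


def is_largest_with_meet_in(w, u, v, opens):
    """True if w & u <= v and every open o with o & u <= v lies inside w."""
    return (w & u) <= v and all(o <= w for o in opens if (o & u) <= v)


# In the chain topology the 'negation' of {0} is empty, so U | (U -> empty)
# falls short of the whole space; in the discrete topology it does not.
U = frozenset({0})
print(sorted(U | implies(U, frozenset(), CHAIN)))
print(sorted(U | implies(U, frozenset(), DISCRETE)))
```

The first printed set is a proper subset of the space, which is the familiar way the excluded middle fails in a non-discrete topological space.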

In the case of proximity spaces, however, there is no entirely satisfactory
way of defining the operation $\rightarrow$ on the lattice of parts, and so no evident way
of interpreting ‘$\rightarrow$’. (This, it may be said, is the source of the vexatious
question of the meaning of ‘$\rightarrow$’ in quantum logic.) However, we can
overcome this difficulty by extending the manifestation relation to
implication formulas as follows. For any space S, we define

$U \Vdash_S A \rightarrow B \Leftrightarrow \forall V \subseteq U\,[V \Vdash_S A \Rightarrow V \Vdash_S B]$.

For topological and discrete spaces S, one can show that (3.1) continues to
hold for any formulas, now including those involving ‘$\rightarrow$’, and so, again, the
tautologies of intuitionistic (or classical) logic coincide with the formulas
supported by every topological (or discrete) space. (Here the applicability of
the term ‘support’ has been extended to include implication formulas.)

The introduction of $\rightarrow$ into $\mathcal{L}$ leads to simple and striking
characterisations of the difference between classical and quantum logic. Let us identify
the tautologies of what I shall term implicative quantum logic as those
formulas (now involving ‘$\rightarrow$’) supported by every proximity space. Now,
one easily shows that for any space S an attribute A is S-persistent if and
only if, for any attribute B, S supports the formula $A \rightarrow (B \rightarrow A)$. Since, as
we have seen, attributes are not generally persistent over proximity spaces, it
follows that the formula $A \rightarrow (B \rightarrow A)$ is not a tautology of implicative
quantum logic. This is consonant with the views of Mittelstaedt (cf., e.g.
Jammer [1974]) who regards the non-provability of $A \rightarrow (B \rightarrow A)$ as being
characteristic of the difference between quantum and classical logic.

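It is natural at this point to introduce the relation of entailment among
formulas. If $\mathcal{C}$ is any class of spaces, we say that a sequence $A_1, \ldots, A_n$ (with
$n \geq 1$) of formulas $\mathcal{C}$-entails a formula B, and write

$A_1, \ldots, A_n \models_{\mathcal{C}} B$

if, for any $S \in \mathcal{C}$ we have

$\Vdash_S A_1 \rightarrow (A_2 \rightarrow \cdots \rightarrow (A_n \rightarrow B) \ldots)$.

We extend this notion of entailment to the case of the empty sequence of
formulas by agreeing that

$\models_{\mathcal{C}} B \Leftrightarrow\ \Vdash_S B$ for every $S \in \mathcal{C}$.

When $\mathcal{C}$ is $\mathcal{T}op$ or $\mathcal{D}is$ this definition of entailment is the familiar one:

$A_1, \ldots, A_n \models_{\mathcal{T}op} B \Leftrightarrow (A_1 \wedge \cdots \wedge A_n) \rightarrow B$ is an intuitionistic tautology
$A_1, \ldots, A_n \models_{\mathcal{D}is} B \Leftrightarrow (A_1 \wedge \cdots \wedge A_n) \rightarrow B$ is a classical tautology. (4.1)

That is, $A_1, \ldots, A_n \models_{\mathcal{T}op} B$ if and only if $A_1, \ldots, A_n$ intuitionistically entails
B and $A_1, \ldots, A_n \models_{\mathcal{D}is} B$ if and only if $A_1, \ldots, A_n$ classically entails B.
Analogously, it is natural to say that $A_1, \ldots, A_n$ implicative quantum logically
entails B and write

$A_1, \ldots, A_n \models_Q B$

when $A_1, \ldots, A_n \models_{\mathcal{P}rox} B$.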

Implicative quantum-logical entailment has the curious feature, not
shared by classical or intuitionistic entailment, that the order of the premises
$A_1, \ldots, A_n$ must be taken into account. (Consequently, in particular, there is
no analogue of (4.1) for $\models_Q$.) For instance, although it is evidently the case
that

$A, B \models_Q B$,

it is not generally the case that

$B, A \models_Q B$.

(To see this, take A = Colour and B = Charge in the example at the end
of §3.) It therefore seems appropriate to say, adapting a phrase of Saul
Bellow’s, that in quantum logic the postulates have a tendency to decay
before the end of the argument!

Observe also that the rule of introduction of premises on the left, valid
for classical and intuitionistic logic, fails for $\models_Q$. For instance it is certainly
the case that

$\models_Q B \vee \neg B$,

but not generally that

$A \models_Q B \vee \neg B$.

Indeed, if A and B are primitive attributes, then it is never the case that
$A \models_Q B \vee \neg B$. To establish this, return to the space S used to verify (2.5).
It is easy to see that, with the interpretations of A and B given there, we have
$Q_0 \nVdash_S B \vee \neg B$, and hence that $\nVdash_S A \rightarrow (B \vee \neg B)$, giving
$A \not\models_Q B \vee \neg B$.
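
The failure just described can be confirmed by brute force on the four-point space used for (2.5). The sketch below is a hedged illustration and not the author's proof: it assumes that parts are unions of quanta, that $U^*$ is the union of the quanta of points outside U, and it encodes formulas as nested tuples, an encoding invented for this example. The implication clause is the one defined at the start of this section.

```python
from itertools import combinations

POINTS = (0, 1, 2, 3)
FULL = frozenset(POINTS)


def quantum(x):
    """Q_x for the relation m ~ n iff |m - n| != 2."""
    return frozenset(y for y in POINTS if abs(x - y) != 2)


QUANTA = {quantum(x) for x in POINTS}
PARTS = {frozenset()}
for r in range(1, len(QUANTA) + 1):
    for combo in combinations(QUANTA, r):
        PARTS.add(frozenset().union(*combo))


def star(u):
    """Assumed complement U*: union of the quanta of points outside u."""
    return frozenset().union(*[quantum(x) for x in POINTS if x not in u])


def manifests(u, f, val):
    """Manifestation clauses, including the clause for implication."""
    kind = f[0]
    if kind == "atom":
        return u <= val[f[1]]
    if kind == "or":
        return any(v | w == u and manifests(v, f[1], val) and manifests(w, f[2], val)
                   for v in PARTS for w in PARTS)
    if kind == "not":
        return all(v <= star(u) for v in PARTS if manifests(v, f[1], val))
    if kind == "imp":
        return all(manifests(v, f[2], val)
                   for v in PARTS if v <= u and manifests(v, f[1], val))
    raise ValueError(kind)


def supports(f, val):
    """The whole space manifests f."""
    return manifests(FULL, f, val)


VAL = {"A": quantum(0), "B": quantum(1)}
A, B = ("atom", "A"), ("atom", "B")
EXCLUDED_MIDDLE_B = ("or", B, ("not", B))

print(supports(EXCLUDED_MIDDLE_B, VAL))              # space supports B or not-B
print(manifests(quantum(0), EXCLUDED_MIDDLE_B, VAL)) # Q_0 does not manifest it
print(supports(("imp", A, EXCLUDED_MIDDLE_B), VAL))  # so A -> (B or not-B) fails
```

The same helper also confirms that A and B are coassertable in this space, since both $A \rightarrow (B \rightarrow A)$ and $B \rightarrow (A \rightarrow B)$ are supported.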


This leads once again to the idea of (in)compatibility. Let us say that two
formulas (attributes) A, B are compatible if

$A \models_Q B \vee \neg B$ and $B \models_Q A \vee \neg A$

and coassertable if

$A, B \models_Q A$ and $B, A \models_Q B$.

Compatibility of A and B means that the introduction of the premise A does
not affect the assertability of $B \vee \neg B$ (or mutatis mutandis). Coassertability
of A and B means that, given the premise A, introducing the premise B does
not affect the assertability of A (or vice versa); in other words, A and B are
simultaneously assertable.

It is easily shown that compatibility implies coassertability. However, the
converse is false since all primitive formulas are evidently coassertable but,
as we have shown above, incompatible. Note that it follows from this last fact
that A and $B \vee \neg B$ are not coassertable for any primitive A, B.

We infer that implicative quantum logic is distinguished from
intuitionistic (and, indeed, classical) logic by the presence of non-coassertable
formulas, and from classical logic by the presence of incompatible formulas.

The concept of quantum-logical entailment also yields a precise
formulation of a general notion of superposition of attributes. Given a space S, let
us say that an attribute A is a superposition of two attributes B, C over S
provided that, for any part U of S, if U manifests A, then U manifests a
superposition of B and C in the sense of §3. This condition is easily seen to be
equivalent to:

$\Vdash_S A \rightarrow \neg\neg(B \vee C)$. (4.2)

We say that A is a (quantum-logical) superposition of B and C if (4.2) holds
for every proximity space S, i.e. if

$\models_Q A \rightarrow \neg\neg(B \vee C)$,

or in other words if

$A \models_Q \neg\neg(B \vee C)$. (4.3)

In the classical case, of course, we would be allowed to infer from (4.3) that
$A \models B \vee C$; but in the implicative quantum-logical context we cannot do so.
This follows from the evident fact that for any attributes A, B, A is a
superposition of B and $\neg B$, but if they are both primitive, A is, as we have
seen, incompatible with B. Thus implicative quantum logic is distinguished
from classical logic by the presence of superpositions which are not reducible to
disjunctions.

Despite the non-classical properties of $\models_Q$, we observe that the classically
valid law

$A, \neg A \vee B \models_Q B$ (4.4)

still holds. And as an immediate consequence, the weaker ‘orthomodular
law’ (cf. Goldblatt [1974])

$A, \neg A \vee (A \wedge B) \models_Q B$

also holds. (To establish (4.4) let S be any proximity space and let U, V be
parts of S with $V \subseteq U$. Suppose that $U \Vdash_S A$ and $V \Vdash_S \neg A \vee B$. Then
there are parts W, Z of S such that $W \cup Z = V$, $W \Vdash_S \neg A$ and $Z \Vdash_S B$. But
since $W \subseteq U$, $U \Vdash_S A$ and $W \Vdash_S \neg A$ jointly imply $W = \emptyset$. Hence V = Z
and $V \Vdash_S B$. This gives $\Vdash_S A \rightarrow [(\neg A \vee B) \rightarrow B]$ and (4.4) follows.) That
(4.4) holds is perhaps surprising since it is easily shown that the classical law
(modus ponens) governing implication

$A, A \rightarrow B \models B$

fails for $\models_Q$. (To see this, take A = B = Colour in the example at the end of
§3.) However, we note that implication still satisfies the fundamental
deduction theorem as a trivial consequence of the definition of $\models_Q$:

$A_1, \ldots, A_n \models_Q B \Leftrightarrow A_1, \ldots, A_{n-1} \models_Q A_n \rightarrow B$. (4.5)


It is tempting to conjecture that the implicative quantum-logical
entailment relation is axiomatisable. That is, one should be able to specify a
‘quantum-logical provability relation’ $\vdash_Q$ based on a set of formal axioms
and rules of inference and then proceed to show that

$A_1, \ldots, A_n \models_Q B \Leftrightarrow A_1, \ldots, A_n \vdash_Q B$.

(As axioms and rules one would presumably include correct assertions such
as (4.4) and (4.5).) The logical calculus based on $\vdash_Q$ would then be, in my
view, a promising candidate for the role of formal quantum logic. So far,
however, I have not succeeded in carrying this out and it remains an open
problem. Nevertheless, the fact that the quantum-logical entailment
relation is definable in a way similar to that for classical and intuitionistic
logic, and satisfies the deduction theorem, suggests that, from a semantical
standpoint at least, implicative quantum logic is a genuine logical system
and not merely an algebraic formalism.


5 SUPERPOSITION OF STATES 


In this final section we relate the concept of superposition of attributes to the 
quantum-mechanical notion of superposition of states. 

We may regard a discrete space as being essentially the same as a classical 
phase space (cf. §1). In such a space S, a state may be identified with a one- 
point subset of S, i.e. a minimal non-empty part of S. If every such part is the 
value in S of a primitive attribute, then we may identify states of S with 
minimal primitive attributes over S, i.e. primitive attributes A such that, for 
any part U of S, 

$U \Vdash_S A \Leftrightarrow U = \emptyset$ or $U = [A]_S$. (5.1)

We shall retain this definition of state when S is an arbitrary proximity space.
Given a proximity space S, and states A, B, C of S, we recall that A is a
superposition of B and C over S if

$\Vdash_S A \rightarrow \neg\neg(B \vee C)$.

If we agree to identify two states A and B whenever

$\Vdash_S (A \rightarrow B) \wedge (B \rightarrow A)$,


it is then readily shown that some of the most important of Dirac’s rules 
governing superpositions ([1936], chapter 1) are satisfied, e.g. 


• The result of superposing any state with itself is the same as the original
state.

• For any states B, C, both are superpositions of B and C.

• Superposition is independent of order.

• Each pair of states has (in general) many different superpositions.


To complete the picture, consider finally the ‘orthodox’
quantum-mechanical framework based on a Hilbert space H. Here the associated
proximity structure is $(H - \{0\}, \approx)$ where $\approx$ is the relation of
non-orthogonality of vectors. For each $x \neq 0$ in H we introduce a primitive
attribute $A_x$ and interpret $A_x$ in the resulting proximity space H by setting

$[A_x] = Q_x = \{y \neq 0 : x \approx y\}$.

Then the $A_x$ are the minimal primitive attributes over H. Moreover, we
identify $A_x$ and $A_y$ precisely when $Q_x = Q_y$, which is easily seen to be
equivalent to: x is in the one-dimensional subspace of H generated by y. In
other words, the (identified) minimal attributes over H (the states of H in
the above sense) correspond to the one-dimensional subspaces of H, i.e. to
the states of H in the usual quantum-mechanical sense. And lastly, it is easy
to show that $A_x$ is a superposition of $A_y$ and $A_z$ in our sense, i.e.

$\Vdash_H A_x \rightarrow \neg\neg(A_y \vee A_z)$

if and only if $Q_x \subseteq Q_y \cup Q_z$, which is in turn equivalent to ‘x is in the
subspace spanned by y and z’, i.e. ‘state x is a quantum-mechanical
superposition of states y and z’.
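
The equivalence between $Q_x \subseteq Q_y \cup Q_z$ and membership of x in the span of y and z can be illustrated in the simplest non-trivial setting. The sketch below is an assumption-laden toy: it works in real three-dimensional space with integer vectors, takes y and z to be linearly independent, and uses the cross product to exhibit a vector orthogonal to both. None of the function names come from the text.

```python
def dot(a, b):
    """Euclidean inner product."""
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    """Cross product in three dimensions: a vector orthogonal to a and b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def in_quantum(v, x):
    """v lies in Q_x iff v is non-zero and not orthogonal to x."""
    return any(v) and dot(v, x) != 0


def in_span(x, y, z):
    """x lies in span(y, z), assuming y and z are linearly independent."""
    return dot(x, cross(y, z)) == 0


Y, Z = (1, 0, 0), (0, 1, 0)
NORMAL = cross(Y, Z)      # orthogonal to Y and Z, so outside Q_Y and Q_Z

INSIDE = (1, 1, 0)        # a combination of Y and Z
OUTSIDE = (1, 0, 1)       # has a component along NORMAL

# If x is outside the span, NORMAL witnesses that Q_x is not inside Q_Y | Q_Z.
print(in_span(INSIDE, Y, Z), in_quantum(NORMAL, INSIDE))
print(in_span(OUTSIDE, Y, Z), in_quantum(NORMAL, OUTSIDE))
```

The witness argument is the contrapositive of the stated inclusion: a vector orthogonal to both y and z belongs to neither $Q_y$ nor $Q_z$, so it can belong to $Q_x$ only when x leaves the span.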

We conclude that the concept of superposition of minimal attributes is the 
correct extension of the quantum-mechanical concept of superposition to 
our more general framework. 

Concluding Remark. Here we have only dealt with propositional logic. But
since all the lattices involved are complete, it is not difficult to extend the 
framework to accommodate predicate logic (cf. Bell [1983]). As far as I can 
determine, however, no fundamentally new features emerge. 


London School of Economics 


Discussions


MORE ON THE RELATIONSHIP BETWEEN TECHNICALLY GOOD 
AND CONCEPTUALLY IMPORTANT EXPERIMENTS: 
A CASE STUDY†


1 In a discussion paper entitled ‘What makes a Good Experiment?’
Franklin [1981] introduces a distinction between experiments that are 
technically good and those that are conceptually important. In the former 
class are experiments which result in more accurate and precise measure- 
ments of physical quantities due to improvements in existing apparatus or 
the construction of an entirely new apparatus. Conceptually important 
experiments are classified according to both the nature of their relationship 
and their relevance to existing theory. Examples of these are the so-called 
‘crucial experiments’ as well as those that are strongly corroborative with 
respect to the core of some already existing theory; an instance of the former 
being the 1957 experiments of Friedman, Garwin and Wu establishing the 
nonconservation of parity and the latter the 1927 experiment of Davisson and
Germer, a result that corroborated de Broglie’s views regarding the wave
nature of the electron. Franklin further points out that an experiment 
labelled as conceptually important can also be classified as technically good, 
as was the case with the Millikan oil drop experiment designed to measure 
the charge of the electron. 

A further aspect of the relationship between technically good and 
conceptually important experiments was pointed out by Lai [1984] in a brief 
discussion of the philosophical relevance of technically good experiments. 
As Lai has noted, a great deal of ingenuity and creativity is required not only 
in the ‘invention’ of theories but in the design of experiments as well. Indeed 
there is a sense in which conceptual importance and technical sophistication 
go hand in hand, especially with respect to the kind of expertise required in 
conducting a technically good experiment. 

The object of this commentary is to focus attention on yet another facet of 
the relationship between technically good and conceptually important 
experiments; one that is often taken for granted in the context of 
philosophical discussions that surround the interplay between theory and 
experiment. The situation I am referring to is the role played by technically 
good experiments in elevating the status of experimental evidence that had 
been hitherto ignored, to the class of results deemed ‘conceptually 
important’. 


† I would like to thank Allan Franklin for valuable comments and suggestions. Support of
research by Social Sciences and Humanities Research Council of Canada is gratefully 
acknowledged. 

At first mention this may seem like a rather obvious and uninteresting
claim. The fact that we rely on technically good experiments when deciding 
between rival research traditions, as in the case of the ‘crucial experiment’, is 
hardly a novel point about the relationship between theory and experiment. 
It is also true that before acknowledging a particular experimental result as a 
worthy opponent for the currently held theory we want to be at least 
reasonably sure that the findings are not the result of experimental artifact; 
so of course we require technically good experiments in the evaluation of 
experimental evidence. In most cases we view the new experimental results 
as possible competition for the existing theoretical account and on that basis 
actively seek sophisticated experimental techniques which will confirm our 
hypothesis, or we attempt to incorporate the results into the extant 
theoretical structure. Although this process seems relatively straightfor- 
ward there are instances where the link between technical expertise and 
conceptual importance is not as perspicuous as one might otherwise think. 
The example I want to consider is the 1947 experiment of Lamb and 
Retherford, a result which prompted correction of the well-confirmed Dirac 
equations that represented the fine structure of the hydrogen atom. The 
curious point about this particular experimental discovery is that an 
anomaly was detected in the theory as early as 1933; an anomaly that 
convinced many theorists that the Dirac formula did not give a rigorous 
representation of the fine structure of hydrogen. Despite these findings no 
theoretical change resulted until 1947. A brief discussion of the case may 
provide some clues as to why this should be so. But first, in order to fully 
appreciate the details of the case, some historical stage-setting is in order.


2 According to the Bohr theory [1913] the second energy level (n = 2) in a
hydrogen atom consisted of two orbits: a circular one (level 2S) and an
elliptical one (level 2P), both with the same energy. Sommerfeld’s
relativistic corrections [1916] changed the representation of the energy
levels slightly and separated the 2S from the 2P level. With the advent of
non-relativistic quantum mechanics the orbits given by the
Bohr-Sommerfeld quantum conditions were replaced by stationary states whose
statistical properties were described by wave functions, $\psi_{n,l,m}$, which were
admissible solutions to the Schrödinger wave equation. The states were
described by the principal quantum number $n = 1, 2, 3, \ldots$, an orbital
angular momentum quantum number $l = 0, 1, 2, \ldots, n-1$, and a magnetic
quantum number $m = -l, -l+1, \ldots, +l$. The energy levels for a Coulomb
field agreed exactly with those of the Bohr theory and depended only on n.
However, when wave mechanics was modified to include relativistic mass
variation the splitting of the energy levels n = 2, l = 1, 0 was 8/3 the value
obtained by Sommerfeld. It was only after the effects of electron spin and
magnetic moment (introduced by Uhlenbeck and Goudsmit in 1925) were
included, together with the spin-orbit interaction, that the fine structure
separation turned out to have the Sommerfeld value. The 2S and 2P levels had
been separated into three levels, $2S_{1/2}$, $2P_{1/2}$, and $2P_{3/2}$; but if one
includes relativity the first two coincide, leaving us with only two distinct
levels. Thus the $2\,^2P_{3/2}$ and $2\,^2P_{1/2}$
states differed in energy (because of spin-orbit interaction) by the amount
predicted by the Sommerfeld formula. Unfortunately the calculated
theoretical intensities for the Balmer components (lines in the visible region of
the spectrum) were not found to be in agreement with observation. In
retrospect this is not surprising: the derivation of the intensities and the
selection rules was unsatisfactory because the spin and relativistic
corrections were added to the quantum mechanical equations only as an
afterthought.

It was Dirac’s 1929 development of the theory of the magnetic electron 
along with the relativistic wave equation that marked the first successful 
attempt at combining the spin hypothesis with the principles of the new 
wave mechanics. He had been unhappy with the solutions of Pauli and 
Darwin¹ and had sought to reformulate the relativistic energy equation in
the presence of the electromagnetic field. He succeeded in proving that the 
electron spin arises as a necessary result of the relativistic formulation of 
quantum mechanics; and, in contrast to the relativistically corrected theory 
of Schrödinger and Pauli, Dirac’s wave equations were truly invariant for 
Lorentz transformations. A further virtue of Dirac’s theory was that not 
only the spin quantum number ($s = \tfrac{1}{2}$) but also the value of the magnetic
moment is derived without any specific assumptions apart from the values of
m and e. Finally, on applying this revised wave theory to the hydrogen atom
we obtain the same formula that was proposed by Sommerfeld in 1916—a 
formula which reproduces the values of the hydrogen terms as well as those 
for the optical spectra of helium, and those in the x-ray region, with 
unprecedented accuracy. 

One further point of importance is that Dirac’s relativistic wave equation 
for the hydrogen atom coincided with the one obtained previously by 
Darwin. As it turns out Darwin’s a priori assumptions concerning polarised
waves were equivalent to postulating electron spin; the difference being that
Dirac’s theory rested on a solid theoretical foundation as opposed to ad hoc
hypotheses introduced for reasons of expediency. 


1 The first attempt to devise a wave equation that would be consistent with electron spin was 
undertaken by Pauli who introduced spin matrices into the classical wave equation. These 
matrices were intended to split the wave equation into a system of two simultaneous equations 
so that the solutions of the system appeared in pairs. The two solutions of a pair were the 
representations of an electron spinning in one sense or the other. But, the problem remained 
of how to provide a wave representation of electron spin since it pertained chiefly to the 
particle picture. Pauli’s treatment suggested that the amplitude $\psi$ of the wave must be defined
by two components instead of being scalar. Since this is the condition required for the
representation of polarised waves we can assume that the wave representation of the spinning
electron is a polarised wave. Darwin had associated the de Broglie waves with polarisation and
on this basis was led to a wave equation which furnished the correct fine structure of the 
hydrogen spectrum. However, both these accounts are open to the objection that they 
introduce spin or the polarisation of the waves in an artificial way without justifying their 
introduction by theoretical argument. 


104 Margaret Morrison 





Figure 1. [Level diagram not recoverable from the source; axis label: Frequency.]


The figure above shows us the fine structure levels belonging to n = 2 and
n = 3 of $H_\alpha$ ($D_\alpha$) according to the Dirac theory. $H_\alpha$ is the strong red Balmer
line of hydrogen (6563 Å), its corresponding analogue being $D_\alpha$ in the
heavier isotope deuterium. The capital letters S, P, D signify the values of
$l = 0, 1, 2, \ldots$. The numerical subscript, e.g. $P_{3/2}$, denotes the value of j and the
superscript, e.g. $2\,^2P$, indicates the doublet nature of the spectrum, namely that there
are two possible values of j for each value of l (except where l = 0). All levels
except the uppermost ones in a set are double since every j smaller than
$n - \tfrac{1}{2}$ occurs in combination with two different values of l. The numbers in
parentheses are the calculated intensities with the component separations
given at the bottom in intervals of cm$^{-1}$.

In order to avoid any confusion surrounding the nature of the three
quantum numbers it should perhaps be pointed out that in the attempt to
formulate a relativistic wave equation prior to 1925 the solution for the H
atom is similar to Sommerfeld’s except that $(l + \tfrac{1}{2})$ takes the place of k in
Sommerfeld’s original equation:

$E = -\frac{RZ^2}{n^2}\left[1 + \frac{\alpha^2 Z^2}{n}\left(\frac{1}{k} - \frac{3}{4n}\right) + \cdots\right]$
$n = n_r + k = 1, 2, \ldots$ and $k = 1, 2, \ldots, n$.


Since the quantum number associated with the orbital angular momentum 
in the new theory is integral, the new formula implies different energy levels 
from the old. The further correction incorporating the spin-orbit energy 
(also calculated relativistically) gave yet another interpretation of the 
quantum numbers. The k in Sommerfeld’s formula is replaced by (j + ½), 
where j is the quantum number belonging to total angular momentum, the 
vector sum of orbital and spin angular momentum. (j + ½) is integral since j 
itself is half integral; moreover, since l may take any value between 0 and 
(n − 1), the values of (j + ½) run from 1 to n. These are precisely the values 
allowed to k on Sommerfeld’s theory. This newer version predicts the same 
energy levels as the old theory but labels them (n, j) rather than (n, k). 
The fact that the orbital quantum number l does not appear in the result 
means that the state [l, j = l + ½] has exactly the same energy as 
[(l + 1), j = (l + 1) − ½]. 
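The relabelling can be illustrated with a minimal numerical sketch. It assumes the leading-order form of Sommerfeld's formula with k replaced by (j + ½), and it uses modern values for the hydrogen Rydberg constant and for 1/α that are not taken from the text; the function and variable names are hypothetical.

```python
# Sketch: fine-structure energies to order alpha^2, labelled by (n, j).
# Assumed constants (modern values, not quoted in the article):
R_H = 109677.58      # Rydberg constant for hydrogen, in cm^-1
INV_ALPHA = 137.036  # reciprocal of the fine structure constant


def level_energy(n, j, Z=1, rydberg=R_H, inv_alpha=INV_ALPHA):
    """Energy in cm^-1 from Sommerfeld's formula with k replaced by j + 1/2."""
    alpha_sq = 1.0 / inv_alpha ** 2
    k = j + 0.5
    correction = (alpha_sq * Z ** 2 / n) * (1.0 / k - 3.0 / (4.0 * n))
    return -(rydberg * Z ** 2 / n ** 2) * (1.0 + correction)


def j_values(l):
    """Allowed j for a given orbital quantum number l."""
    return [0.5] if l == 0 else [l - 0.5, l + 0.5]


# n = 2: the S level (l = 0) and the two P levels (l = 1).
e_2s_half = level_energy(2, j_values(0)[0])
e_2p_half = level_energy(2, j_values(1)[0])
e_2p_three_half = level_energy(2, j_values(1)[1])

# l never enters level_energy, so 2S(1/2) and 2P(1/2) coincide exactly.
doublet_separation = e_2p_three_half - e_2p_half
print(e_2s_half == e_2p_half, round(doublet_separation, 3))
```

Because the energy depends only on n and j, the degeneracy of the two j = ½ levels is built into the formula; under the assumed constants the n = 2 doublet separation comes out near 0.365 cm⁻¹.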

For a considerable period of time Dirac’s relativistic wave equation was 
considered to be one of the few and final achievements in physics. However, 
in 1947 measurements at very high radio frequencies enabled Lamb and 
Retherford to conclusively show that Dirac’s formulae, though very much 
more satisfactory than all previous ones, nevertheless failed to fit the 
observable facts completely. Many spectroscopists had found the separation 
between the main components of the Hα line to be slightly smaller than the 
value predicted by theory; specifically, a deviation of 1000 Mc/sec in the 
position of the 2²S₁/₂ level was detected. Pasternack [1938] had suggested 
that this discrepancy could be explained by assuming the level 2S₁/₂ to be 
about 0.03 cm⁻¹ higher than 2P₁/₂, in contradiction to Dirac’s theory which 
does not vary the position of the levels and predicts the same energy for both. 
The interesting point to note with respect to this discovery is that similar 
anomalies were reported as early as 1934 yet little or no attention was paid to 
them until the 1947 discovery of Lamb and Retherford. It seems odd that 
this should be the case given that deviations from the Dirac theory were 
noted on at least six separate occasions during the period from 1934 to 1938, 
in contrast to only one confirming instance noted in the literature. So, 
despite numerous instances corroborating the initial discrepancy theorists 
seemed quite satisfied to accept the possibility that the cause might well be 
experimental artifact, or rest content that the discrepancy simply didn’t 
matter. 


3 The controversy began with the 1934 paper of Houston and Hsieh which 
discussed a new method of treating interferometer patterns of doublets in 
the Balmer series. Up until this time the observed positions had generally 
agreed with those expected except for some discrepancy between the 
observed and calculated relative intensities of the two members of each 
doublet. This discrepancy was usually ignored since it was felt that the 
knowledge of the conditions of excitation was insufficient to give an accurate 
prediction of the intensities. Because of the satisfactory way in which 
the apparent separations of the hydrogen doublets (i.e. the separations 
of the centres of gravity of the two component groups) seemed to fit the 
theory, Houston and Hsieh hoped to use this separation as a means of 
experimentally determining the fine structure constant. To obtain the desired 
accuracy in the fine structure constant it would be necessary to measure 
this apparent separation with a precision of the order of 0.1 per cent. The 
method involved the measurement of the intensities of the minima between 
the members of the doublet and between successive orders of interference. 

The ordinary method of measuring an interferometer pattern requires the 
estimation of the location of some characteristic point on each fringe, usually 
the centre of gravity, either on the original plate or on a microphotometer. 
From the location of these points (the centre of gravity being the one that is 
very easily found) the diameters of the corresponding circles can be 
determined and these give the values of the fractional order of interference at 
the centre of the pattern for each line. In terms of these orders of interference 
the wave-number difference between the two lines is Δν = Δp/2d, where Δp 
is the difference between orders of interference of the two lines and d is the 
separation between the surfaces of the interferometer plates. The idea 
behind the procedure of Houston and Hsieh was to find the plate separation 
for which Δp = 0.500, without actually measuring the fractional orders of 
interference. The method then consisted of photographing the interference 
pattern with different values of d and selecting the value which satisfied the 
test. It was possible to approach the required precision on the first five 
members of the series but the results were so far from those to be expected 
that it seemed clear that they had not measured the fine structure constant 
and instead had attained a degree of precision in which the theory was no 
longer satisfactory. 
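As a rough illustration of the relation Δν = Δp/2d, the sketch below computes the plate separation at which a doublet of given wave-number difference would show a half-order offset. The function names are hypothetical, and the example input of 0.328 cm⁻¹ is the theoretical Hα interval quoted elsewhere in the text, used here only as a convenient number.

```python
def wavenumber_difference(delta_p, d_cm):
    """Wave-number difference (cm^-1) from the order difference and plate separation (cm)."""
    return delta_p / (2.0 * d_cm)


def plate_separation_for(delta_p, delta_nu):
    """Plate separation d (cm) at which a doublet of separation delta_nu gives order difference delta_p."""
    return delta_p / (2.0 * delta_nu)


# Separation at which a 0.328 cm^-1 doublet would give a half-order offset.
d_half_order = plate_separation_for(0.500, 0.328)
print(round(d_half_order, 4))  # about 0.76 cm

# Round trip: the chosen separation reproduces the assumed doublet interval.
print(round(wavenumber_difference(0.500, d_half_order), 3))
```

Read this way, selecting the plate separation that satisfies the Δp = 0.500 test amounts to measuring Δν without estimating fractional orders directly.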

The method of measurement used by Houston and Hsieh was not directly 
applicable to Hα and consequently not much attention was given to it as a 
method for providing reliable values for that line. They did conduct five 
measurements on three plates in the ordinary fashion and obtained Δν = 
0.3171 ± 0.0020. However, by making the required corrections for the 
interferometer separation they calculated the separations for the centres of 
gravity for the Hα lines to be 0.3086 or 0.3049. Although this direct 
application was not justified in the case of Hα, a graphical analysis based on 
the theoretical form of Hα indicated that these results needed to be increased 
only by about 1% to get the correct separation. Hence, they concluded that 
the method of direct visual measurement gave results which were too large, 
at least for the Hα line. Their method could be successfully used on Hβ and 
Hγ, the strong blue and violet lines, as well as Hδ and Hε; and in all cases the 
predicted separations were greater than those observed. In the table below 
the first row gives, for each line, the calculated ratio of the intensity of the 
red component to the violet component in the first column and, in the second 
column, the separation of the centres of gravity in terms of Δν/Rα² 
when Rα² is taken as 5.818 (Rydberg constant times the square of the fine 
structure constant). The second row gives the corresponding observed 
quantities and the last row gives the ratio of the observed to the calculated 
separations. 
                     Hβ                    Hγ                    Hδ                    Hε

Calculated     1.098  0.3440          1.040  0.3534          1.014  0.3583          0.974  0.3598
Observed       1.089  0.3298±0.004    1.041  0.3388±0.004    1.009  0.3451±0.006    1.006  0.3506±0.006

Obs./calc.            0.959                  0.959                  0.963                  0.974
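The last row of the table can be checked directly from the two rows above it. The short sketch below simply divides each observed separation by the calculated one, taking the four columns in the order printed; the variable names are illustrative.

```python
# Centre-of-gravity separations from the table, in column order.
calculated = [0.3440, 0.3534, 0.3583, 0.3598]
observed = [0.3298, 0.3388, 0.3451, 0.3506]

# Ratio of observed to calculated separation, rounded as in the table.
ratios = [round(o / c, 3) for o, c in zip(observed, calculated)]
print(ratios)  # [0.959, 0.959, 0.963, 0.974]
```

Every ratio falls below unity by between roughly 2.5 and 4 per cent, which is the sense in which the predicted separations exceeded the observed ones in all cases.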


The two sets could have been brought together by using 1/α = 139.9; but 
this change was so great that it seemed impossible that there should be any 
such error in the constants composing α. As a result Houston and Hsieh 
concluded that the theory was inadequate to explain the observations. 
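The size of the required change in α can be estimated with a short sketch. It assumes only that the doublet separation scales with Rα², so that an observed-to-calculated ratio r corresponds to α² being scaled by r. The reference value 1/α = 137.0 is an assumption made for illustration; the text does not say which baseline Houston and Hsieh used.

```python
import math


def implied_inverse_alpha(ratio, reference_inverse_alpha=137.0):
    """1/alpha needed to reproduce an observed/calculated separation ratio.

    Assumes separation is proportional to alpha^2, so alpha scales as sqrt(ratio).
    The default reference value is an illustrative assumption.
    """
    return reference_inverse_alpha / math.sqrt(ratio)


# Ratio of observed to calculated separation for the first two lines of the table.
needed = implied_inverse_alpha(0.959)
print(round(needed, 1))
```

With this assumed baseline the result is close to the 139.9 quoted above, a shift of about 2 per cent in a constant then believed to be known far more precisely.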

There existed a few explanations of why this anomalous result occurred. 
One was that the conditions of excitation were such that the relative 
intensities were not those assumed in the table. However, the observed 
intensity ratios of the component groups of doublets were so near to the 
calculated values that it would seem that the intensities were all 
approximately as calculated. Another possibility was that the effect of the 
nuclear magnetic moment was sufficient to distort the structure so that the 
centres of gravity did not have the calculated separation. If the spin moment 
of the nucleus was assumed to be ½ and the magnetic moment is the one 
normally expected given the mass of the nucleus, then the separation of the 
levels into which the 2S term is split is 0.002 cm⁻¹ while the separations at 
all other levels are much smaller. The relation of these two separated levels 
to the original 2S level is such that for transitions to the former state the 
centre of gravity of the line was not in fact displaced. Yet another 
explanation of these results, one that had been pointed out to Houston and 
Hsieh by Bohr and Oppenheimer, was that the calculation of the energy 
levels had been made without including the interaction between the electron 
and the radiation field. The customary procedure was to compute the levels 
without the interaction and then introduce the interaction with the field as a 
perturbation. At the time of the publication it was not known how to 
combine the atom and the field into one system, but nevertheless, it seemed 
clear that when such a combination could be made a relative displacement of 
the levels could be expected. Furthermore, the displacement should be 
of the order of α times the fine structure separation, just the order of 
magnitude of the effect that was observed. 

A few weeks later, on April 1, results were reported of another fine 
structure analysis on Hα done by Williams and Gibbs. A Fabry-Pérot 
interferometer was used to examine the fine structure of H¹ and H² 
(deuterium) lines. The interval (Δν) between the main components of the 
doublet was found to be 0.308 cm⁻¹ for H¹ and 0.321 cm⁻¹ for H² as 
compared to the 0.328 cm⁻¹ indicated by theory. The observed results 
correspond to values of 1/α equalling 141.7 and 138.8 respectively. The 
relative intensities of the components as revealed by the analysis were in 
approximate agreement with those predicted by the theory (See Figure 1). 
Prior to this in 1933 Spedding, Shane and Grace had published a study 
showing microphotometer curves which, by their asymmetry, indicated the 
presence of the same components for H² as were revealed in the work of 
Williams and Gibbs. The separation was found to be 0.324 cm⁻¹ resulting 
in a value for the reciprocal of the fine structure constant 1/α = 138. This 
discrepancy, they thought, was due to low pressure in the discharge tubes. A 
further study by Spedding, Shane and Grace in 1935 indicated that the 
components of H¹ and H² had the same relative intensities but the observed 
values differed substantially from the theoretical predictions. For instance, 
the intensity of component 3 was calculated at 1.14 whereas their mean value 
for the intensity of the third component was 1.38 for H¹ and 1.39 for H². 
The fine structure constant was determined at 1/α = 137.4 ± 0.2. However, 
contra the results of Williams and Gibbs, Spedding et al. found no 
appreciable difference in the doublet separations for H¹ and H².¹ 

Another study on the structure of hydrogen and deuterium done in March 
1937 by W. V. Houston also confirmed the discrepancies between theoretical 
predictions and observed values. The aim of this exercise was to express 
the interferometer pattern produced by spectral lines as a Fourier series 
with the coefficients regarded as the quantities to be measured. The 
coefficients were to be computed in terms of positions, intensities and 
parameters describing the shapes of the lines in order that the observations 
could be compared with the spectroscopic theory. The main reason for 
employing this new method was to overcome the difficulty in analysing the 
fine structure pattern in light elements; a difficulty caused by the fact that 
the widths of the component lines were of the same order of magnitude as 
their separations, making the object under consideration a continuous 
spectrum whose intensity distribution must be analysed rather than a 
spectrum of discrete lines. There are several reasons why the breadths of the 
component lines (both inherent and resolved) are > 0. The first is the natural 
width due to finite lifetimes of the states involved or classically due to the 
damping of the oscillations. In this case we have a line whose shape is: 


\[
I(\theta - \theta_0) = \frac{\beta}{(\theta - \theta_0)^{2} + \beta^{2}},
\]


where θ₀ is the position of the maximum and β is a constant which depends 
upon the lifetimes of the initial and final states. The broadening due to 
collisions gives a line of the same shape as the one described above and as a 
result these two effects can be treated together by giving β a suitable value. 
In hydrogen the most important source of broadening is the Doppler effect 


1 Some confusion surrounds the interpretation of Spedding’s [1935] results for the separation 
of the main components. The other participants in the debate, especially Williams [1938], cite 
Spedding’s values as being 0.314 for the doublet interval in Hα and 0.318 cm⁻¹ for Dα (cf. 
Pasternack [1938]). Even Drinkwater et al. [1940] who argue in favour of Dirac’s predictions 
cite Spedding, Shane and Grace as contributing to the group of anomalous results. In 
examining Spedding’s results we can see that the mean value for the doublet separation 
between the main components of H¹ and H² is approximately 0.327 as compared with the 
theoretical value of 0.328 cm⁻¹. No further papers by Spedding et al. are noted by the authors 
who report their results as anomalous. The sole reference to their [1935] as a confirmation of 
Dirac’s results for component intervals is found in Series [1957]. 
caused by the motion of the emitting atoms. Another source is the finite 
resolving power of the interferometer which results in an apparent line form 
dependent on the reflecting power (r) of the interferometer plates. 

A continuous spectrum requires a different method of description and 
hence must be characterised by a different set of numbers than a two line 
spectrum. When an interferometer is used the periodicity of the pattern 
suggests the use of a Fourier series with the coefficients regarded as 
characteristics of the spectrum. However, the basic problem with this 
method was that the theoretical quantities were not always satisfactorily 
known, making it necessary to adopt a criterion for deciding what 
constitutes the best set of parameters of the component lines and then to find 
them essentially by trial. In other words, although the equations could not 
be specifically solved for the parameters describing the shapes of the lines in 
terms of the Fourier coefficient, numerical means could be used to find the 
parameters that best reproduced the observed values. It was this method 
that was applied to the two plates of Hα and Dα. In Houston’s analysis the 
parameter 8 was neglected and the natural width of most of the component 
lines was dominated by the width of the 2P level, giving β a value of 0.1. If all 
the component lines had the same value of β the effect on the Fourier 
coefficients could be produced by a reduction in the reflection coefficient (r). 
However, portions of the components designated as 2 and 3 were due to 
transitions to the 2S level. If this level had been isolated it would have been 
found to be metastable and very narrow. Since it coincided with the 2P₁/₂ 
level it was subject to a great deal of perturbation and was probably 
broadened. For that reason, as well as to avoid the introduction of more 
parameters than would be justified by the number of usable Fourier 
coefficients, Houston treated all of the lines as having the same natural 
breadth; a factor which was included in the parameter (r). 

The values of Δν, which represent the wave number difference between 
the components designated by the subscripts, differed by about 1% between 
the two plates. In one instance a result of 0.320 cm⁻¹ was given for both Hα 
and Dα while the other plate gave a result of 0.317 cm⁻¹; both definitely less 
than the 0.328 cm⁻¹ predicted by the theory. The results indicated that to fit 
the observations with the four strongest lines given by the theory it was 
necessary to make the separations of the two most intense components some 
2% less than the theoretical value; once again confirming the conclusions of 
Houston and Hsieh in 1934 as well as those of Williams and Gibbs. 

Again in October 1938 R. C. Williams reported that within the limits of 
error of the observations the positions of the fine structure components were 
found to be invariant with changing discharge conditions. This result 
further confirmed the previously reported anomalies and ruled out speculation 
that it was discharge conditions that caused the discrepancies with 
theoretical values. In Williams’ study the fine structures of both Hα and Dα 
were observed under varying conditions of excitation in a discharge tube 
filled with pure hydrogen and pure deuterium. The interference patterns 
obtained on the photographic plates were analysed with a microphotometer 
and analyses were made of the intensities and positions of the fine structure 
components. The average interval between the two main components of the 
complex was found to be 0.319 ± 0.005 cm⁻¹ for Hα and 0.321 ± 0.003 cm⁻¹ 
for Dα. The second and third most intense components were optically 
resolved from each other in Dα but not in Hα. The average interval between 
these components was found to be 0.130 ± 0.003 cm⁻¹. According to theory 
these intervals are respectively 0.328 cm⁻¹ and 0.108 cm⁻¹; for the second 
interval this is a deviation of about 20% from the observed value. With respect 
to the interval between components 1 and 2 the Hα interval seemed 
consistently about 0.002 cm⁻¹ less than the Dα interval with either one being 
about 0.009 cm⁻¹ to 
0.011 cm⁻¹ less than the theoretical prediction. One of the most important 
aspects of Williams’ investigation was the discovery that component 3 was 
considerably removed from its theoretical position and its intensity was far 
greater than predicted. Up until this time component 3 was never resolved 
from component 2 and it was assumed to be in its theoretically predicted 
position. These findings would ultimately result in a difference between 
corrections based on the theoretical position and intensity of the component 
and those based on the observed position and intensity. A further discovery 
was that the relative intensities of the components varied greatly from 
theoretical predictions. The mean value for intensities relative to theory was 
0.77 for component 1 and 1.69 for component 3 of Hα; for Dα, component 1 
was 0.82 and component 3 was 2.1. According to Williams’ calculations 
component 1 could be either more or less intense than component 2 but 
never quite reached its theoretical relative intensity. Component 3 had an 
intensity relative to component 2 that was considerably larger than it should 
have been according to theory. These results further confirmed the work of 
previous investigators, especially Spedding, Shane and Grace, whose values 
for intensities varied greatly from theoretical prediction. On the basis of his 
findings Williams concluded that there was little possibility that the 
potential difference across the discharge tube caused a field large enough to 
explain the discrepancies with theoretical predictions. 

Later in December of that same year Pasternack reported results which, 
in retrospect, could be seen as crucial in the development of the theoretical 
corrections required for predicting the doublet separations of Hα and Dα 
lines. Pasternack noted that upon consideration of the findings of Williams 
[1938] and Houston [1937] both of the reported deviations were consistent 
with a perturbation of the 2²S level of deuterium. He calculated the 
displacements of the 2²S levels equal to x cm⁻¹ (where x is equal to the 
numerical value of the displacement) and on that basis was able to predict 
that the second component of the line would undergo an apparent 
displacement of approximately 0.3x. (The second component consisted of 
two transitions, 2²P₁/₂–3²D₃/₂ and 2²S₁/₂–3²P₃/₂, the former having 2.4 
times the intensity of the latter.) Similar results were obtained for the third 
component, its displacement being 0.9x. Consequently the separation of 
components 1 and 2 was decreased by an amount of 0.3x, whereas the 
separation of components 3 and 2 was increased by 0.6x. This result agreed 
fairly well with those of Williams [1938] and of Houston and Robinson 
(unpublished) if the displacement was taken to be about 0.03 cm⁻¹. And, as 
it turned out, an S level displacement of this magnitude checked quite well 
with discrepancies observed in the doublet separations of other Balmer lines 
of hydrogen. If we look back at the results obtained by Houston and Hsieh 
we can see that the theoretical separations derived by Pasternack (below) on 
the assumption of a 2S level displacement of 0.03 cm⁻¹ are in very close 
agreement: 


Hα: 0.308   Hβ: 0.330   Hγ: 0.339   Hδ: 0.343   Hε: 0.345. 


It would seem that a displacement of the S levels should point toward some 
perturbing interaction between the electron and the nucleus; however, an 
interaction of this sort was much too large to be accounted for by the 
assumption of a finite size of electron and proton. 
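The arithmetic behind Pasternack's figures can be sketched as follows. The sketch assumes that the apparent position of an unresolved component is the intensity-weighted mean of its transitions, and that only the transition involving the 2S level moves when that level is displaced by x. The 0.9x shift of component 3 is taken as given from the text, and the function names are hypothetical.

```python
def component2_shift(x, intensity_ratio=2.4):
    """Apparent shift of component 2 when the 2S level is displaced by x.

    Assumes an intensity-weighted centre of gravity in which only the weaker
    (2S-3P) transition moves; the stronger transition is intensity_ratio times
    as intense and stays fixed.
    """
    return x * 1.0 / (intensity_ratio + 1.0)


def shifted_separations(x, sep_12=0.328, sep_23=0.108, shift_2=0.3, shift_3=0.9):
    """Separations (cm^-1) after components 2 and 3 shift by shift_2*x and shift_3*x."""
    new_12 = sep_12 - shift_2 * x
    new_23 = sep_23 + (shift_3 - shift_2) * x
    return new_12, new_23


print(round(component2_shift(1.0), 2))  # close to the 0.3x quoted in the text
new_12, new_23 = shifted_separations(0.03)
print(round(new_12, 3), round(new_23, 3))
```

With x = 0.03 cm⁻¹ the main interval falls from 0.328 to about 0.319 cm⁻¹ and the interval between components 2 and 3 rises from 0.108 to about 0.126 cm⁻¹, which can be compared with Williams' observed 0.319 and 0.130 cm⁻¹.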

A study done in 1940 by Drinkwater et al. reported values that appeared 
to support Pasternack’s suggestion of a perturbation of the 2²S₁/₂ level. The 
doublet separation for the two main components of Dα was measured at 
0.3205 ± 0.0004 cm⁻¹ which, with correction, resulted in a value of 0.316 
cm⁻¹. This was approximately 0.011 cm⁻¹ less than the theoretical value. 
The corresponding separation for Hα, observed at 0.319, yielded a value of 
0.316 cm⁻¹; a result almost identical with the average given by Williams 
[1938] (0.316 cm⁻¹). The greatest discrepancies occurred in the measurement 
of the doublet separation for components 2 and 3. For Dα a fairly 
constant value of 0.119 ± 0.008 cm⁻¹ was obtained, in place of the 
theoretical 0.108 cm⁻¹. The observed half-widths for the latter two 
components were found to be 0.094 and 1.08 cm⁻¹ respectively. The 
separation of components 2 and 3 for Hα was 0.1314 cm⁻¹ with half-widths 
of 0.130 cm⁻¹ and 0.136 cm⁻¹. Despite the fact that the doublet separations 
agreed with the anomalous experimental results Drinkwater et al. concluded 
that no real evidence had been obtained to show that the fine structures 
departed substantially from the values calculated from Dirac’s equations. 
This conclusion was based primarily on the measurement of an increased 
intensity of component 3 (1.3 times its theoretical value) coupled with an 
increased half-width and difficulties with the experimental apparatus; for 
instance the problem of hydrogen contamination in the discharge tubes. 
This was the only extensive study reported in the literature that favoured the 
Dirac results in spite of the experimental findings. A few years prior to this 
in 1937 results were reported in Z. Phys. of an investigation done by M. 
Heyden on Dα. She had found the separations of the components to be in 
close agreement with the theoretical values and, for the first time, obtained 
intensity ratios in accordance with theory. However, this was not really seen 
as a confirmation of the theory because Williams [1938] had criticised the 
results on account of the very large grain of the plates that were used. This 
criticism seemed justified and was later agreed to even by those who denied 
that the new experimental findings provided evidence against the theory, 
that is, people like Drinkwater et al. The criticism was accepted on the 
ground that the actual oscillations in the microphotometer record, presumably 
due to grain effect or dust, had an amplitude equal to a large 
fraction of the intensity of the third component thereby making it difficult to 
determine whether component 3 was resolved. 

Overall the theorists seemed unperturbed by the anomalous findings and 
appeared content to accept the Dirac formula despite its disagreement with 
experimental evidence. The reason for this most probably lies with the fact 
that owing to Doppler broadening of the lines in comparison to the small 
splitting of the lower n = 2 states, reliable separation of all components of 
the hydrogen line was not successful. As a result it remained vague as to how 
real the shift actually was. Nevertheless, the importance of these anomalous 
results should be emphasized due to the fact that the hydrogen atom was the 
only system for which both the Schrödinger and Dirac equations allowed an 
accurate solution; therefore discrepancies between theory and experiment 
could not be attributed to a bad approximation or inaccuracy in calculation. 
Furthermore, one should keep in mind that Williams [1938] in his studies on 
Dα had reduced the Doppler width and was able, for the first time, to resolve 
component 3. But despite this advance in spectroscopic analysis the 
experimental reports of these anomalies were not considered by the theorists 
to be conceptually important. 

This situation persisted until 1947 when Lamb and Retherford, using the 
methods of microwave spectroscopy, were able to detect transitions between 
the very close 2S₁/₂ and 2P₁/₂ levels of an amount corresponding to a 
frequency of about 1000 Mc/sec (0.033 cm⁻¹) within an accuracy of 100 
Mc/sec (0.003 cm⁻¹). This result, which by the way corroborated the 1938 
hypothesis of Pasternack, was confirmed spectroscopically by the use of 
discharge tubes cooled with liquid hydrogen. The Doppler width was then 
sufficiently reduced to allow the component to be resolved, thereby enabling 
one to measure the term difference (2P₁/₂ − 2S₁/₂). The method Lamb and 
Retherford used depended on a novel property of the 2²S₁/₂ level. 
According to the Dirac theory this state exactly coincides in energy with the 
2P₁/₂ state which is the lower of the two P states. The S state, in the absence 
of external electric fields, is metastable (l = 0).¹ A beam of atoms in the 
metastable state 2²S₁/₂ was produced by bombarding atomic hydrogen. 
These metastable atoms are detectable when they fall on a metal surface and 
eject electrons. If the atoms are subjected to the proper radio frequency 


1 This state is called metastable because decay from it to the ground state (n = 1, l = 0) is highly 
inhibited by the Δl selection rule and because all other states lie above it except for the n = 2, l 
= 1, j = ½ state, which, according to the Dirac theory has exactly the same energy as the 
metastable state. 
power they will undergo transitions to the non-metastable states 2²P₁/₂ and 
2²P₃/₂ and decay to the ground state 1²S₁/₂ where they are not detected. It 
was in this way that Lamb and Retherford were able to determine that the 
difference between the 2²S₁/₂ and 2²P₁/₂ states was 0.033 cm⁻¹ or about 9% 
of the spin-relativity doublet separation; an energy separation so small that 
the frequency is in the microwave radio range. Further radio-frequency 
measurements established this difference (called the Lamb shift) to a very 
high level of accuracy, 0.03528 cm⁻¹ for light hydrogen. Finally the 
discrepancies between the observed values and the predictions of the Dirac 
theory had been conclusively verified. 
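The unit conversions in this passage can be checked with a brief sketch. It divides a frequency by the speed of light to obtain a wave number, and compares the result with an assumed doublet separation of about 0.365 cm⁻¹ (Rα²/16 with modern constants). That figure and the function name are assumptions for illustration and are not stated in the text.

```python
C_CM_PER_S = 2.99792458e10  # speed of light in cm/s


def mc_per_sec_to_wavenumber(freq_mc):
    """Convert a frequency in Mc/sec (MHz) to a wave number in cm^-1."""
    return freq_mc * 1.0e6 / C_CM_PER_S


shift = mc_per_sec_to_wavenumber(1000.0)       # the 1947 level difference
uncertainty = mc_per_sec_to_wavenumber(100.0)  # the quoted accuracy

# Fraction of the n = 2 doublet separation (0.365 cm^-1 is an assumed value).
fraction = shift / 0.365
print(round(shift, 3), round(uncertainty, 3), round(fraction, 2))
```

On these assumptions 1000 Mc/sec corresponds to about 0.033 cm⁻¹, roughly 9 per cent of the doublet separation, in line with the figures quoted above.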

Two weeks after the publication of the Lamb-Retherford results H. A. 
Bethe proposed a theoretical account of the shift in energy levels; it was due 
to the interaction of the electron with the radiation field—a real effect of a 
finite magnitude. What had gone wrong with the Dirac formula? Simply 
this. The mass of the electron as it appears in the equations of Dirac’s theory 
and in the formulae that predict atomic energy levels is taken to be the 
observed mass of the electron moving in the absence of a field or in a very 
weak field. The crucial idea due to Bethe was renormalisation of mass. The 
quantum theory of radiation predicted that a free electron should have an 
infinite mass. This was thought to be due in part to the tendency of an 
electron to emit and reabsorb virtual quanta or equivalently from the 
interaction of the electron with the zero point oscillations of the electromagnetic 
field. Some explanation had to be found for the fact that the 
observed mass is finite. Bethe however realised that this was outside the 
scope of the then current theory and so chose to ignore electromagnetic mass 
completely. So, for an electron bound in a hydrogen atom an infinite energy 
occurs; but this is merely a manifestation of the infinite electromagnetic 
mass which should be eliminated in some future theory. If the mass terms 
are properly subtracted a finite remainder is obtained—a remainder which 
would be zero for a free electron. In the case of a bound electron the force 
field modifies the effect of the electromagnetic field and a finite displacement 
of the energy levels results. The key to the problem lay within quantum 
electrodynamics, a theory whose physical content was in very close 
agreement with observations. 

It should perhaps be noted that just prior to the publication of the Lamb- 
Retherford results it was shown in the experiments of Nafe, Nelson and 
Rabi (May 1947) that there existed an addition to the magnetic moment of 
the electron. This was detected by analysis of the hyperfine structure of 
hydrogen and deuterium. For a given nuclear spin J there exist two 
hyperfine structure levels separated in energy by Δν (expressed in frequency 
units). In hydrogen this splitting is due to an interaction between the 
internal magnetic field produced by the motion of the electron and a spin 
magnetic dipole moment of the nucleus. The hyperfine structure separations 
of atomic hydrogen (ν_H) and deuterium (ν_D) were measured directly by 
means of an atomic beam magnetic resonance method. For each atom two 
resonance lines were measured each at the same value of the magnetic field 
and the ν_H and ν_D were evaluated entirely from differences in frequencies. 
Neither the value of the magnetic field nor the g values of the atom entered 
into the final result. As was the case with the examination of the fine 
structure, discrepancies with the theory were noted. The study showed an 
important difference between the measured and calculated values of ν_H and 
ν_D of about 0.26 per cent compared with the probable error of the calculated 
value of about 0.05 per cent. For atomic hydrogen the ν_H interval was 
calculated at 1416.90 ± 0.54 Mc whereas the measured value was 1421.3 ± 
0.2 Mc. The hyperfine structure for deuterium ν_D involved a much smaller 
discrepancy, with the calculated value being 326.53 ± 0.16 Mc and the 
measured value equalling 327.37 ± 0.03 Mc; similarly for the ν_H/ν_D ratio, the 
measured value was 4.3416 ± 0.0007 and the calculated value was 
4.3393 ± 0.0014. Overall the difference between calculated and observed 
values was five times greater than the claimed probable error in the natural 
constants. On the basis of their results Nafe et al. concluded that whether the 
failure of theory and experiment to agree was because of some unknown 
factor in the theory of the hydrogen atom or simply an error in the estimate 
of one of the natural constants such as α², only further experiment could 
decide. It was Schwinger [1948] who finally showed that the magnetic 
moment of the electron undergoes small changes as a result of radiation so 
that it is no longer equal to one Bohr magneton. Thus the Landé g factor in 
the theory of the anomalous Zeeman effect also required correction. 
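The size of the hyperfine discrepancy can be worked out from the quoted frequencies. The sketch below computes the relative difference between measured and calculated intervals and the two ratios; the function and variable names are illustrative, and the numbers are those given in the text.

```python
def relative_discrepancy_percent(measured, calculated):
    """Difference between measured and calculated values as a percentage of the calculated value."""
    return 100.0 * (measured - calculated) / calculated


# Hyperfine intervals in Mc, as quoted for hydrogen and deuterium.
hydrogen = relative_discrepancy_percent(1421.3, 1416.90)
deuterium = relative_discrepancy_percent(327.37, 326.53)

# Hydrogen-to-deuterium ratios from the same four numbers.
ratio_measured = 1421.3 / 327.37
ratio_calculated = 1416.90 / 326.53

print(round(hydrogen, 2), round(deuterium, 2))
print(round(ratio_measured, 4), round(ratio_calculated, 4))
```

Both intervals come out a few tenths of a per cent above the calculated values (about 0.31 and 0.26 per cent), several times the quoted probable error of roughly 0.05 per cent, and the two ratios reproduce the quoted 4.3416 and 4.3393.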


4 In conclusion then one is led to ask why this anomaly, initially 
discovered in the thirties, had no apparent theoretical impact until the late 
forties? Clearly it was not (at least not explicitly) for lack of theoretical 
footing on the part of the early experimenters for recall that Lamb and 
Retherford provided no theoretical basis for their discovery either. The 
obvious reason for the primacy of the Lamb-Retherford results over the 
others was the technical expertise displayed in that particular study; the use 
of techniques which allowed Lamb and Retherford to sufficiently reduce the 
Doppler width, something that had posed a problem for the earlier 
experimenters in calculating how real the shift actually was. Again the 
curious point here is that despite repeated confirmation in the literature of 
these theoretical anomalies coupled with Williams’ success in resolving 
component 3, no attempts were made to incorporate them into the extant 
theoretical structure. Some speculation was offered, particularly by 
Pasternack, as to the explanation of these findings; but unfortunately this 
resulted in little or no attention from the physics community in general. The 
findings reported in Physical Review were not considered to be conceptually 
important despite their repeated occurrence nor were they seen as providing 
the basis for even a possible correction of the Dirac equations. Their 
conceptual importance emerged solely as a result of the technical quality of 
the Lamb-Retherford experiment. Hence, it was primarily because the 
experiment was technically good that it became conceptually important, and as a
result paved the way for quantum electrodynamics.


MARGARET MORRISON 
University of Western Ontario 
NOVELTY AND CONFIRMATION


In a recent publication Campbell and Vinci ([1984]) have presented a 
lengthy critique of the role of novelty of evidence in confirmation theory. 
This problem was discussed in Redhead [1978] who introduced the 
following criterion for evidence e to support an hypothesis h 


pr(e/~h&b′) ≪ 1 (1)

where b′ denotes background knowledge in a sense which excludes e being a part of
it.¹

Campbell and Vinci refer to (1) as the Redhead condition,² and then
proceed to criticise it by means of two counterexamples. In the first case they 
discuss, the Redhead condition is claimed to fail, and yet significant 
confirmation of h results, while in the second case Campbell and Vinci give 
an example where the Redhead condition is satisfied, and yet, it is claimed 
confirmation does not occur. As a result of these arguments the authors 
conclude that the Redhead condition is neither necessary nor sufficient for 
confirmation. Let me first clarify the intended significance of (1) as it was 
introduced in my [1978]. It was supposed to be a criterion for evidence e not 
to be ad hoc in relation to the hypothesis h, where non-ad hocness was 
equated, following Zahar [1973], with heuristic novelty in the sense that e 
did not belong to the problem-situation h was constructed to deal with. Or, 
to put it the other way round, h is an ad hoc explanation of e if h is designed
specifically to explain e. The condition (1) was supposed to be an explication 
in Bayesian terms of the notion of heuristic novelty. It was not concerned 
with explicating some modified notion of epistemic novelty as Campbell and 
Vinci assume in Section 3 of their paper. In particular b′ is background
knowledge in the sense of those auxiliary assumptions and initial conditions
needed to derive e from h and b′. Worrall in his [1978] has emphasised a
dangerous confusion concerning the meaning of background knowledge³ as
between the sense just mentioned, and the much more wide-ranging sense of 
everything we hold unproblematic at the time h is proposed, whether it is 
used in the derivation of e or not. I am grateful for this opportunity of 
clarifying what sense of background knowledge I had in mind when writing 
my [1978]. Let us refer to the second sense of background knowledge as b.⁴
Then Campbell and Vinci assume that my b′ is b less e. In fact on page 321
Campbell and Vinci introduce my sense of background knowledge, for 
which they use the symbol a, and attribute tentatively to Giere [1979] the 
novelty criterion 


pr(e/~h&a) ≪ 1. (2)


They dismiss (2), which in their notation is, as we have remarked, the ‘real’ 
version of the Redhead condition, because they claim ‘this condition is 
trivially satisfied. The only exception would be where e has a good 
probability of obtaining on the basis of a alone’.⁵

But this is the crucial point of my disagreement with Campbell and Vinci. 


Suppose e is used as a heuristic ingredient in the construction of h in the 


1 We follow the notation of Campbell and Vinci [1984] rather than Redhead [1978] throughout 
this discussion. 

2 Campbell and Vinci [1984], p. 323. 

3 Worrall [1978], n. 6, p. 66.

4 Campbell and Vinci [1984], p. 321.

5 Campbell and Vinci [1984], p. 321. 
sense of a ‘filter’ on all the various alternatives to h which the investigator
will entertain with non-vanishing prior probability, i.e. the only alternatives
to h will be hypotheses which also explain e. In such a situation we would
clearly arrive at the result 


pr(e/~h&a) =1. (3) 


So the condition (2) is not trivially satisfied. In the case where h is designed to 
explain e, it is replaced by (3). And this demonstrates (3) as a necessary 
condition for ad hocness. In my [1978] I argued that it could reasonably be 
regarded also as a sufficient condition.¹

Of course under the ‘filter condition’, where (3) holds, we have pr(e/a)
= 1.² This probability assignment arises, however, not from the knowledge
comprised in a, but from the operation of the heuristic ingredient 
incorporated in the ‘filter condition’. 
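
The step from the ‘filter condition’ to (3) can be illustrated with the law of total probability over the rivals of h. The sketch below is only an illustration under stated assumptions: ‘h_i explains e’ is modelled as pr(e/h_i&a) = 1, and the priors and the function name are invented for the example.

```python
from fractions import Fraction

def pr_e_given_not_h(alternatives):
    """pr(e/~h&a) by total probability over the rivals of h.

    `alternatives` is a list of (prior, likelihood) pairs: the prior
    pr(h_i/a) of each rival h_i and the likelihood pr(e/h_i&a).
    """
    total_prior = sum(prior for prior, _ in alternatives)
    if total_prior == 0:
        raise ValueError("no rival has non-vanishing prior probability")
    return sum(prior * like for prior, like in alternatives) / total_prior

# Filter condition: every rival that is entertained also explains e.
filtered = [(Fraction(1, 5), Fraction(1)), (Fraction(1, 10), Fraction(1))]
# No filter: one rival with substantial prior makes e improbable.
unfiltered = [(Fraction(1, 5), Fraction(1)), (Fraction(1, 2), Fraction(1, 100))]

print(pr_e_given_not_h(filtered))    # equals 1, as in (3)
print(pr_e_given_not_h(unfiltered))  # strictly less than 1
```

Under the filter the value is 1 whatever priors are chosen, which is why the assignment reflects the heuristic filter and not the content of a.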

Let us now see how the two counterexamples fare with the Redhead 
condition interpreted as (2) rather than (1) (again remember we are 
following the Campbell and Vinci usage of the various symbols involved). 


(1) The Case of Evidence not Known but Probable³


For the prediction of Brownian motion in ‘novel’ liquids I agree that (1) 
fails, but (2) does not. So under the ‘correct’ interpretation of the Redhead 
condition we get no counterexample by remarking that Brownian motion in 
‘novel’ liquids confirms statistical thermodynamics on the grounds that the 
latter theory was not designed to explain Brownian motion even in liquids 
which were not ‘novel’. 


(2) The Copycat Case⁴


This is an interesting example which requires some careful discussion. Dr 
Original shows that h implies e (in the presence of a we may assume). Then
Dr Copycat ‘cooks up’ h** to explain e, not because he believes e to be true, 
but because he does not want to be ‘scooped’ by Dr Original if e does turn 


1 Campbell and Vinci give a curious logical analysis of the argument of my [1978] on pp. 330-1 
of their paper. For example they are very puzzled as to why if h is designed to explain e then
pr(e/~h&a) = 1. They refer to this as an assumption in the argument, when in fact it follows
from the definition of ad hocness employed in my [1978]. They go on to claim that I confuse 
explanatory relevance with probabilistic relevance. I would simply deny this charge. If h is 
designed to explain e, it is still explanatorily relevant to e, although not probabilistically 
relevant. This is quite consistent with the remark on p. 357 of my paper that in the ad hoc 
situation ‘the explanation of e is guaranteed independent of whether h is true or false’. [I have
replaced T by h in this quotation for consistency of notation]. All this means is that all 
alternative hypotheses considered with non-vanishing prior probabilities are also explanatorily 
relevant to e! 

2 This result in no way contradicts the probability axioms and hence is quite consistent with 
coherence. It would only be unacceptable to the personalist Bayesian if he imposed the 
stronger condition of strict coherence, which would imply pr (e/a) = 1 only if a > e. 

3 Campbell and Vinci [1984], p. 323. 

4 Campbell and Vinci [1984], p. 323.
out to be true. Campbell and Vinci claim if e is observed to be true, then h**
will receive no confirmation. It seems to me that everything depends on 
what we take Dr Copycat’s attitude to be to possible alternatives to h**. If 
Dr Copycat refuses to entertain alternatives to h** that fail to explain e, can 
we conclude that Dr Copycat has no degree of belief in such alternatives? 
The importance of distinguishing between ‘entertaining’ and ‘believing’ lies 
in the fact that the personalist Bayesian analysis on which my [1978] was 
based is really an account of rational dynamics of degrees of belief in the light 
of evidence. If Dr Copycat believes in the possible truth of, but does not 
entertain, alternatives to h** which fail to explain e, then (2) may be satisfied 
and I think confirmation should accrue to h** if e is observed to be the case. 

Confirmation would only not accrue if Dr Copycat equated his beliefs with 
his other disingenuous psychological attitudes towards hypotheses. With 
the example as given there seems no reason to do this. 

In either event it is clear that no counterexample is provided to the 
applicability of the Redhead condition in the form (2).


MICHAEL REDHEAD 
Chelsea College, University of London 
THE QUEST FOR THE ONE WAY VELOCITY OF LIGHT


Nissim-Sabat’s [1984] analysis of the thought experiment through which he 
believes the one way velocity of light could be measured, contains a simple 
error, which renders his argument invalid. The expression for τ₂ (p. 64) will
only be correct if ε = 1/2. For other values of ε, the correct expression is τ₂
= −ΔT + (2d/c)ε. Elimination of ΔT from the two correct expressions for τ₁
and τ₂ does not yield an expression containing ε.

Since he does not indicate the origin of the expression for τ₂, I can only
conjecture the error made in obtaining it. From the expression for τ₁ (p. 63),
we can infer that ΔT is the ‘ε-desynchronisation’ of clock B with respect to
clock A, that is the difference in reading of clock B from that of a clock at the
same point in space as B but ε-synchronised with clock A. If one assumes
that, after the described interchange of positions of clock A and B, the
ε-desynchronisation of A with respect to B is −ΔT, then one would obtain
Nissim-Sabat’s expression for τ₂ up to the sign. But this assumption is only
correct if ε = 1/2. That it cannot be correct for all ε follows once it is
recognized that if it is true for ε = 1/2, then it cannot be true for other values
of ε.

The quest for the one way velocity of light is beginning to look like the 
quest for a perpetual motion machine, for in both cases the fruitlessness of 
the quest can be demonstrated by quite elementary means. If the problem is 
set up in the manner of Winnie [1970] and Nissim-Sabat, then it reduces to 
the simple question of whether special relativity can be formulated in certain 
‘ε-Lorentz coordinate systems’ rather than just the ‘Lorentz coordinate
systems’ used in the familiar standard formulation of the theory.¹ That this
is possible has been known in principle since as early as 1913, when Einstein
introduced techniques which would enable special relativity to be formulated
in arbitrary spacetime coordinate systems. The quest for the ‘true’
value of ε and the (coordinate dependent) one way velocity of light which it
determines, is as fruitless as the quest for the subset of ‘true’ coordinate 
systems in which special relativity can be formulated. For this task, all 
coordinate systems are equally viable. 

When one formulates special relativity in coordinate dependent terms, it 
is necessary to stipulate which coordinate systems are being used. In this 
sense, it is trivially true that any coordinate dependent formulation of 
special relativity must contain conventions. If we call events ‘simultaneous’ 
if their time coordinates have the same value, then the part of these 
coordinate specifying conventions which determine the time coordinate can 
be regarded as a synchrony convention. Clearly then, any standard or
ε-formulation of special relativity must at some point introduce this type of
synchrony convention. 

However this coordinate dependent notion of simultaneity is not especially
interesting. The real question is whether there is any other sense in
which there is a conventionality in the intrinsic spacetime structures of 
special relativity, associated with simultaneity. That there is not is suggested 
by a celebrated result of Malament [1977]. In Minkowski spacetime, 
standard synchrony is the only non-trivial equivalence relation suitable for 
the simultaneity relation definable between events by the relation of causal 
connectibility. Critics of the conventionality of simultaneity will find more 
comfort in pursuing this result, rather than in the vain quest for the one way 
velocity of light, and are referred for ammunition to recent discussions in 
Torretti’s [1983] and Friedman’s [1983] books, both of which are under 
review for this journal. 

JOHN NORTON 

Department of History and Philosophy of Science 

University of Pittsburgh 

1 If (t, x, y, z) is such a Lorentz system, in which the Minkowski metric has the form
diag(c², −1, −1, −1), then a typical ε-Lorentz system is (t′, x, y, z), where t′ = t + (2ε − 1)x/c and
0 < ε < 1.
NISSIM-SABAT ON THE ONE-WAY VELOCITY OF LIGHT


In a recent paper Charles Nissim-Sabat [1984] has described a
Gedankenexperiment to measure the one-way velocity of light. He considers
two identical isochronous clocks A and B located at the points α and β
on the x-axis, with d = x_β − x_α, and he assumes an interval timer to be located
at α. When each of the two clocks reads a specific time, say 10.00 am, light
signals will be emitted. The interval timer measures the time difference, τ₁,
between the two pulses. Immediately after these emissions the clocks will be
moved. A is moved with the constant round trip velocity, v, (i.e. measured
with standard synchronisation) to the point γ on the x-axis (x_γ > x_β > x_α).
From γ A is moved with the round trip velocity −v to β where it is stopped.
Similarly, B is moved with the round trip velocities v and −v from β via γ to
α, where it is stopped. (All times of acceleration and deceleration are
supposed to be negligible.) When A and B arrive at β and α light signals are
emitted from the two points to the interval timer, which measures the time
τ₂. Nissim-Sabat claims that the one-way velocity of light can be calculated
from the measurements of τ₁ and τ₂. I intend to argue that this is a mistake.

The one-way velocity of light along the positive x direction is supposed to
be c₊ = c/(1 + a) and along the negative x direction it is c₋ = c/(1 − a),
where a is a number between −1 and 1. (These formulae correspond to
Reichenbach’s expressions for ε = ½(1 + a).) According to Winnie [1970]
the velocities along the positive and negative x directions are


v₊ = vc/(c + va) and v₋ = vc/(c − va)


where v is the round-trip velocity. 

Assume that a time-coordinate, t, is introduced corresponding to the
above constant of anisotropy, a, and that A reads the t-time t₁ at the
beginning of the experiment. According to the assumption B also reads t₁
when the first light signal is emitted from it, but since the two clocks are not
necessarily synchronised, it must be assumed that the t-time for the
emission for B is t₂ = t₁ + ΔT, where ΔT is some constant. The first
measurement of the interval timer becomes


τ₁ = t₂ + d/c₋ − t₁ = ΔT + d(1 − a)/c.
In order to calculate the other value to be measured with the interval timer
we have to find the t-times corresponding to the arrivals of A and B at β and
α, respectively. It follows from the above assumptions that A arrives at β at
the time


t₃ = t₁ + (x_γ − x_α)/v₊ + (x_γ − x_β)/v₋

and that B arrives at α at the time

t₄ = t₂ + (x_γ − x_β)/v₊ + (x_γ − x_α)/v₋.

In consequence the second measurement of the interval timer should be

τ₂ = t₄ − (t₃ + d/c₋) = ΔT − d/v₊ + d/v₋ − d/c₋ = ΔT − d(1 + a)/c.

This means that

τ₁ − τ₂ = 2d/c.


Hence this difference provides no information about the one-way velocity of
light, although it depends on the value of a according to Nissim-Sabat, who
states that τ₂ = ΔT − 2d(1 − ε)/c. In my symbolism this means that τ₂ =
ΔT − d(1 − a)/c which contradicts the above value. If he were right it would
be possible to measure the constant of anisotropy, a. But he is not!
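
The cancellation of a can be checked with exact arithmetic. The sketch below is a hedged illustration, not part of the original argument: it assumes that A travels α → γ → β and B travels β → γ → α, that the interval timer sits at α, and that the Winnie velocities are as given above. The function name and the numerical values are invented for the example.

```python
from fractions import Fraction

def interval_timer_readings(a, v, c, x_alpha, x_beta, x_gamma, delta_T, t1=Fraction(0)):
    """Return (tau1, tau2) for a constant of anisotropy a with -1 < a < 1."""
    d = x_beta - x_alpha
    c_minus = c / (1 - a)              # one-way light speed, negative x direction
    v_plus = v * c / (c + v * a)       # clock speed, positive x direction
    v_minus = v * c / (c - v * a)      # clock speed, negative x direction
    t2 = t1 + delta_T                  # t-time of B's first emission
    tau1 = t2 + d / c_minus - t1
    t3 = t1 + (x_gamma - x_alpha) / v_plus + (x_gamma - x_beta) / v_minus  # A reaches beta
    t4 = t2 + (x_gamma - x_beta) / v_plus + (x_gamma - x_alpha) / v_minus  # B reaches alpha
    tau2 = t4 - (t3 + d / c_minus)
    return tau1, tau2

c, v = Fraction(1), Fraction(1, 2)
x_alpha, x_beta, x_gamma = Fraction(0), Fraction(3), Fraction(5)
for a in (Fraction(0), Fraction(1, 4), Fraction(-2, 3)):
    tau1, tau2 = interval_timer_readings(a, v, c, x_alpha, x_beta, x_gamma,
                                         delta_T=Fraction(7, 10))
    print(a, tau1 - tau2)   # the difference is 2d/c whatever a is
```

Each reading depends on a separately, but their difference does not, so the two measurements cannot fix the constant of anisotropy.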

In order to calculate τ₂ Nissim-Sabat uses the reading of the two clocks.
He states correctly that the clocks would measure the time


τ_αβ = d(1 − v²/c²)^(1/2)/v


when moved from α to β or from β to α, and that the clocks would measure
exactly the same time τ_γ for the segment β → γ → β of their trip. He
concludes: ‘Thus each clock would send out a pulse at a time τ_αβ + τ_γ later
than it sent out the original (“10 am”) pulse.’ But this is only true if ‘time’ is
understood as ‘the readings of A and B’. It appears, however, that
Nissim-Sabat believes that since the two pulses are both sent τ_αβ + τ_γ later than the
first pulses according to the reading on the respective clocks, the difference
in t-time between the last pair of pulses will equal the difference between the
first pair of pulses, i.e. ΔT. But this is certainly a mistake! In order to show
this the general Lorentz transformations are needed [Øhrstrøm 1980]:


x′ = (x − v(t − ax)) · (1 − v²/c²)^(−1/2)
t′ = ((1 − va′)(t − ax) + (a′ − v)x) · (1 − v²/c²)^(−1/2)
where a′ is the constant of anisotropy in the system S′ which is moving with
the round trip velocity v along the x-axis.
It is easy to show that it follows from these transformations that the time 


measured by A while it moves from α to β relates to the corresponding
interval of t-time in the following way:


(Δt)_αβ = τ_αβ(1 + va)/(1 − v²/c²)^(1/2), (Δt′)_αβ = τ_αβ.

Similarly, the time measured by B while it moves from β to α relates to an
interval of t-time:


(Δt)_βα = τ_αβ(1 − va)/(1 − v²/c²)^(1/2), (Δt′)_βα = τ_αβ.


It follows from these equations that (Δt′)_αβ = (Δt′)_βα = τ_αβ does not imply
(Δt)_αβ = (Δt)_βα as Nissim-Sabat seems to believe.
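
A small numerical illustration of this asymmetry follows. It is a sketch under assumptions that are not spelled out in the text: units with c = 1 (which the dimensionless factor 1 ± va suggests), and a t-time related to standard time T by t = T + ax. The function name and the sample values are invented for the example.

```python
import math

def leg_times(d, v, a):
    """Return (clock reading, t-time alpha->beta, t-time beta->alpha), with c = 1."""
    standard = d / v                           # standard-synchrony duration of either leg
    proper = standard * math.sqrt(1 - v ** 2)  # what the moving clock itself records
    forwards = standard + a * d                # t = T + a*x with displacement +d
    backwards = standard - a * d               # displacement -d
    return proper, forwards, backwards

d, v = 1.0, 0.5
for a in (0.0, 0.3, -0.6):
    tau, dt_ab, dt_ba = leg_times(d, v, a)
    gamma = 1 / math.sqrt(1 - v ** 2)
    # Agreement with the two relations stated above.
    assert math.isclose(dt_ab, tau * (1 + v * a) * gamma)
    assert math.isclose(dt_ba, tau * (1 - v * a) * gamma)
    print(f"a={a:+.1f}: clock reading {tau:.4f}, "
          f"t-time alpha->beta {dt_ab:.4f}, beta->alpha {dt_ba:.4f}")
```

The clock reading is the same for both legs, while the two t-time intervals agree only when a = 0.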

I do not know of any method by means of which the one-way velocity of 
light can be measured. The method of Nissim-Sabat can certainly not be 
used for that purpose. 


PETER ØHRSTRØM
University of Aarhus 
Reviews


BENENSON, F. C. [1984]: Probability, Objectivity and Evidence. Routledge 
and Kegan Paul. Hardback £19.95. Pp. xii+284. 


This book presents the most sophisticated defence to date of the logical 
relation theory of probability. In part the sophistication is due to Benenson’s 
use of Dummett’s views on language as a point of departure for a theory of 
the meaning of probability statements. A realist equates the meaning of a 
sentence with its truth-conditions; an anti-realist equates the meaning with 
its assertion conditions, i.e. the conditions under which, or the evidence in
the light of which, the sentence may be justifiably asserted to be probably 
true. Benenson elects to give an anti-realist account but claims that his 
version of the logical relation theory stands independently. He does not give 
his reasons for this choice. 

Objective accounts of probability, e.g. propensity theories, make the truth 
or falsity of probability statements unknowable and hence conflict with what 
Benenson takes to be the fact that finite amounts of evidence can warrant 
conclusions on the truth of probability statements. At best such accounts 
make probability statements probable, either in the same sense hence 
generating an infinite regress, or in a different sense, in which case—anti- 
realism!—the meaning of one kind of probability statement is given in terms 
of probability statements of another kind whose meaning has yet to be 
explained. The anti-realist must adopt a position more stringent than 
Dummett’s and supply a finite procedure for deciding the truth or falsity of 
probability statements on the basis of evidence. As probability statements 
are true or false independently of any individual the logical relation theory is 
the only plausible candidate. According to this theory probability is 
relational, a probability statement asserting the extent to which a body of 
evidence entails an event of some sort. (Benenson does not commit himself 
as to whether probability applies to events or propositions.) 

Normally probability statements are not explicitly relational, no evidence 
is mentioned. Such ‘ordinary probability statements’ are, we are told, 
elliptical. An ordinary probability statement is true if the total available 
evidence partially entails the event in question to the extent indicated by the 
numerical value occurring in the ordinary statement; it is false otherwise. 
I.e. the meaning—the assertion conditions—of ‘p(a) = r’ are as follows: ‘e’ is 
the total available evidence and p(a/e) = r, i.e. ‘e’ entails ‘a’ to the extent r,
where this condition is purely logical. The temptation is to take ‘p(a) = r’ as
independently meaningful and Benenson’s formulations do not always help 
one to resist it. 

Evidence is of two kinds—namely, specificatory and statistical. The 
former is variously given as ‘evidence of the material properties relevant to
the outcome of a certain event—for example, evidence of the weight and 
shape of a coin when we are concerned with the chance of it landing heads on 
a particular occasion’ and the ‘specific set of characteristics’ shared by the set 
of trials whose results are reported in the statistical evidence—the frequency 
with which the phenomenon in question has occurred in the trials. 
Henceforth I shall use ‘e’ to represent specificatory and ‘s’ to represent 
statistical evidence. 

The main problem for the logical relation theory is to explicate ‘total 
available evidence’. If this is understood in a subjective sense then the 
objectivity of ordinary probability statements is contradicted. ‘Total 
available’ must be defined objectively. Its definition for both kinds of 
evidence and the justifications take up a large part of the book. In particular 
the revisibility of ordinary probability statements is discussed at length. 
Suffice it to say here that the argumentation is sound and avoids the 
objections usually aimed at logical relation theories. The total available 
evidence at a given time is all the evidence which can in practice be 
ascertained using the technology existing at that time. The total available 
statistical evidence is never obtained: life is too short to sit around tossing 
coins all day. On the other hand, no matter what the relative frequency 
would be in the total available evidence we can show mathematically by 
using, say, the Chebyshev inequality, that there is a high probability that the 
relative frequency in any small sample actually obtained is close to this 
notional relative frequency. Within the logical relation theory Benenson has 
found a role for statistical inference. 
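
The Chebyshev argument can be made concrete as follows. This is a minimal sketch under an assumption the review does not state, namely that the sample consists of n independent trials each with the same chance p of success (with q = 1 − p); the function name is illustrative.

```python
def chebyshev_bound(n, eps, p=None):
    """Upper bound on P(|sample frequency - p| >= eps) for n independent trials.

    The variance of a single trial is p*q with q = 1 - p; if p is unknown,
    the worst case p*q = 1/4 is used.
    """
    if n <= 0 or eps <= 0:
        raise ValueError("n and eps must be positive")
    pq = 0.25 if p is None else p * (1 - p)
    return min(1.0, pq / (n * eps ** 2))

for n in (100, 1000, 10000):
    print(n, chebyshev_bound(n, eps=0.05))
```

The bound shrinks in proportion to 1/n, so even a modest sample makes a large deviation from the notional relative frequency improbable.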

The mention of relative frequencies may lead one to doubt that the 
Chebyshev inequality as interpreted above can be accommodated in the 
theory. That it can is due to Benenson’s definition of probability. He adopts 
the straight rule (corresponding to the value zero for λ in Carnap’s
continuum); thus, if the statistical evidence s for trials of a kind detailed in e
entails a relative frequency r for outcome a, p(a/e&s) =r. This is the 
logically true, a priori, evaluation of the probability. As s constitutes 
empirical evidence it may be considered doubtful that the probability is a 
priori. Because Benenson includes s as part of the second argument nothing 
more than a, e and s needs to be known in order to determine the value of 
p(a/e&s). In particular it is not necessary to know that s is true, so the 
determination of the probability is indeed a priori. It is more difficult to 
make sense of the claim that this is a matter of logic. The straight rule is one 
of any number of conventions which could be used. Benenson gives two 
plausibility arguments for rejecting values of λ other than zero but gives no
indication why all measures but one are utterly beyond the pale, as they must
be if this is a matter of logic. Referring to Carnap’s system, positive values of
λ make the probability depend on the whole language, a feature not
appreciated in explications of ordinary probability statements. 

Furthermore, only if λ = 0 are probabilities undefined in the absence of
statistical evidence; the use of unbiased methods in science accords with
this. (Benenson’s account states that the logical relation is only defined if 
there is statistical evidence but this is intuitively wrong: if e entails a unique 
outcome a then, surely, p(a/e)=1 independently of any statistical 
evidence.) 
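
The role of λ can be seen in a short sketch of Carnap’s continuum, in which the estimate for an outcome is (n_a + λ/k)/(n + λ), with n_a successes in n trials and k mutually exclusive outcome types in the language. The function name and the choice k = 2 are assumptions made for illustration; nothing here is taken from Benenson’s text.

```python
from fractions import Fraction

def carnap_estimate(successes, trials, lam=0, k=2):
    """Estimate from Carnap's lambda-continuum; None where it is undefined.

    lam = 0 gives the straight rule (the observed relative frequency).
    k is the number of outcome types the language distinguishes.
    """
    denominator = trials + lam
    if denominator == 0:
        return None                      # straight rule with no statistical evidence
    return (successes + Fraction(lam, k)) / denominator

print(carnap_estimate(7, 10))            # straight rule: the relative frequency
print(carnap_estimate(7, 10, lam=2))     # pulled towards 1/k
print(carnap_estimate(0, 0))             # undefined without statistical evidence
print(carnap_estimate(0, 0, lam=2))      # positive lambda yields a prior value 1/k
```

With λ > 0 the value depends on k, that is on how the language carves up the outcomes, whereas with λ = 0 it depends on the statistical evidence alone and is undefined when there is none.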

Certain obscurities remain with this logical relation theory. Benenson 
criticises attempts to forge an operational definition of probability from 
Neyman-—Pearson confidence intervals partly because these are intervals of 
real numbers whereas the probability calculus shows probabilities to be 
single real numbers. In his theory if there is no statistical evidence the
probability is undefined—which version of the probability calculus allows 
that? Defined probabilities in the theory are exact and rational-valued. The 
use of continuous distributions in science gives rise to irrational-valued 
probabilities. Although, unlike most logical relation theories, this one easily 
accommodates demographic and actuarial statistics, it is not at all clear how 
probabilistic predictions such as occur in quantum mechanics are to be fitted 
in. If 0 < r < 1 and a theory predicts that p(a) = r, this cannot be interpreted
as saying no matter what the size of the sample the relative frequency is 
exactly r. 

For a number of reasons the claim that this theory is independent of the 
anti-realist view of meaning is to be doubted. Suppose that the logical 
relation theory can be used to explain scientific uses of probability. In that 
case the realist is not interested in the partial entailment of an outcome by 
evidence, but rather by evidence and the laws of nature which obtain in this 
world. To this it may be objected that for the realist s is constrained by the 
laws of nature, but then, in what sense is p a logical relation? How can there 
be probabilistic laws of nature if p is a matter of logic? Of course, for an anti- 
realist ‘laws of nature’ are of our making and should have nothing to do with 
an objective probability relation. 

A formal problem arises from the fact that a binary probability function 
maps elements of B² into [0, 1], where B is some Boolean algebra. Boolean
algebra is the algebra of classical logic which for the anti-realist is the logic of 
decidable propositions. If a proposition is not decidable then there can be no 
relative frequency of success in a finite sample. So, for example, there can be 
no assignment of inductive probabilities to universal generalisations. Either 
the logical relation theory here described does not suffice to capture all the 
uses of probability, contrary to Benenson’s contention, or one is forced to 
uphold an extreme anti-realism, denying meaning to all but decidable 
propositions. 

The last two chapters deal with the principle of indifference. The 
principle states necessary and sufficient conditions for the identity of 
probability values. Since one way of providing semantics for a concept is to 
give identity conditions for it, the principle is construed as a rudimentary 
semantic definition for comparative, i.e. qualitative, relational probability
statements. Again the relation obtains between an outcome and a body of 
evidence, including statistical evidence. The most interesting feature
of Benenson’s account is the rejection of the principle in the absence of 
evidence. In the absence of statistical evidence contradictions ensue, as 
Bertrand’s paradox makes clear. Only by picking one set of predicates, a 
choice which in the absence of evidence is arbitrary, can the paradoxes be 
avoided. Again the use of λ = 0 is vindicated for otherwise a positive value of
λ, representing the ‘weight’ attached to logical ‘width’, must be chosen and
that factor, in the absence of evidence is arbitrary. 

Stylistically the text is repetitive and flat. In places formal notation could 
have been used to advantage in making the meaning clear. On the few 
occasions when symbols are used they are mostly unexplained and printed in 
the body of the text to the detriment of legibility. For the most part the 
material is non-technical and requires no previous knowledge of the subject. 
In view of this it is unfortunate that Carnap’s ‘L-true’ is introduced without 
explanation, and that the reader is assumed to know that q = 1—p in the 
statement of Chebyshev’s inequality. 

In short, a book which makes a number of interesting points and which 
develops a logical relation theory immune to many of the standard 
criticisms. On the other hand, more work should have gone into spelling out 
the philosophical presuppositions, especially the extent to which the theory 
is anti-realist, and more attention been paid to the formal aspects of the 
theory, the disclaimer in the introduction notwithstanding. 


PETER MILNE 
London School of Economics 


R. BROWN [1984]: The Nature of Social Laws. Machiavelli to Mill. 
Cambridge University Press. ix+270 pp. £22.50.


This book is a dense, high-powered but somewhat irritating contribution to 
the debate as to whether social life can be studied by the methods of natural 
science. This debate has raged strongly in recent decades, and its impact has 
recently been clearly demonstrated by Sir Keith Joseph’s decision to 
rename the Social Science Research Council: as is well-known, it has 
become merely the Economic and Social Research Council. Brown holds, 
on the evidence of this book, no particular brief for Sir Keith; but one 
wonders whether the politician might, as is apparently his habit, yet 
recommend the book to his civil servants. 

The originality of the book lies in its historical treatment. Most recent 
analyses of the debate mentioned have, perhaps naturally, concentrated on 
the late-nineteenth century German dispute between Natur- and 
Geisteswissenschaften; in contrast, Brown considers proponents of a science
of society from the sixteenth to the mid-nineteenth century. His stated aim 
is to characterise, trace and criticise those notions. Brown is a fine
intellectual historian, and he is well aware, for example, of the very 
considerable intellectual legacy of medieval Christendom, and comes close 
to the position advocated variously by Max Weber, John Milton and 
Michael Cook and Patricia Crone:¹ namely that European intellectual
advance was made possible by a combination of the Judaic emphasis on the 
other worldliness of God with the Greek stress on laws of nature, or, in other 
words, by a situation in which the world was held to be regular but in need of 
investigation. But, despite his intention (p. 23), he does not really explain 
why this combination became so powerful in intellectual life in the sixteenth 
and seventeenth centuries.

Instead Brown asserts that the first attempt at creating social laws was that 
made by the early economists’ move from isolated generalisations to the 
establishment of the interconnection of an economic system. Brown has 
considerable praise for the genuine, but ultimately still flawed, development 
pioneered by the School of Salamanca and by Mun and Misselden. He notes 
that the latter were helped to their conclusions by employment by the East 
India Company and the Merchant Adventurers: ‘this demonstrates . . . how 
advantageous it is for the growth of systematic thought that the investigator 
be able to find a general theoretical solution to his immediate practical 
problem’ (p. 47). Brown then elegantly dismisses Hobbes’s naturalistic 
foundation for social laws on the grounds that the argument is designed both 
as description and justification, that is, Hobbes, counter to his psychology, 
presumes that the reading of Leviathan will rationally convince actors of the 
justice of government. When considering Hobbes, he does, however, argue 
that there may be something to be said for the study of the unintended consequences 
of the actions of individuals; but he does not wish to combine this, as does 
Popper, with methodological individualism, as is clear from his comments 
on Hume, to whom he devotes a slightly less powerful chapter. Hume’s 
account of the passions is rejected because of its simplicity but Brown does 
not as a result resort to any kind of rational man theory: rather he insists in 
sociological guise that the aims of human action depend upon the culture 
and values of particular societies. 

Interesting accounts follow of the stage theory of the late-eighteenth- 
century French philosophers and of the earliest neo-classical economists, 
i.e. Ricardo and Nassau Senior. Undoubtedly the best chapter in the whole 
book, however, is that devoted to the work of Vico, a figure one might expect 
Brown to warmly endorse. Arguing against Sir Isaiah Berlin, Brown 
convincingly demonstrates that the method of historical knowledge does 
not, at least without the invocation of the workings of Providence, allow the 
establishment of first principles somehow immune to empirical refutation, 


1 This theme runs throughout Weber’s work, and no single reference is appropriate. But see M. 
COOK and P. CRONE [1977]: Hagarism, Cambridge University Press; and J. MILTON [1981]: 
‘The origins and development of the concept of the laws of nature’, European Journal of 
Sociology, 22. 


and thus a secure starting point for a humanistic science of history. The final 
two chapters are clear, but are directed against much better known targets. 
Comte’s three-stage historical theory is criticised on Popperian grounds; 
although some of these have been rightly questioned by Peter Urbach,¹ 
Comte’s failure to specify a mechanism or mechanisms accounting for the 
transition from one stage to another does render his scheme largely vacuous. 
The final chapter on Mill is thorough and dwells largely on the almost total 
failure to create the promised understanding of human motivation, i.e. the 
‘science of ethology’. 

Brown’s critical analysis is perhaps too easily achieved. He does not give 
us a full history, but selected episodes. Adam Smith and Karl Marx 
would have served as weightier proponents of their art—the latter, for 
example, improved on Comte in offering a mechanism (unfortunately false), 
that of class conflict, explaining the transition from one stage to another. 
More worrying, Brown occasionally seems to be shuffling the cards in his 
pack of judgement. He notes interestingly that it is possible to generate four 
types of social science by combining the debate concerning non-rational vs 
rational behaviour with that as to whether social life should be studied at the 
level of the individual or the collectivity (pp. 255-6), and he perfectly fairly 
notes that his assembled authors could never agree upon which approach 
was finally appropriate. Here he sounds like Thomas Kuhn, berating the 
paradigmless state of social science. But, as noted, he also draws upon 
Popper extensively, as well as others, and one feels that against such an 
armoury, possibly not always internally consistent, no early proponent of 
social science has much of a chance. Finally it must be noted that Brown 
occasionally switches between condemning the search for laws of history and 
the search for social laws per se; but he realises this himself, and the last 
chapter of his book tries (but in only thirteen pages!) to outline his own 
position, and to justify the concluding sentence of the book, which claims 
that the debate between humanistic and scientific approaches to society can 
‘safely be laid to rest’ (p. 264). What is Brown’s own position, and to what 
extent does it hold water? 

Brown makes two points, and space does not allow him to work out their 
implications and interrelationships. Firstly, he argues that much social 
behaviour is to be seen as rule-bound, i.e. that human action follows the 
norms of particular cultures thus making, for example, comparative politics 
effectively impossible. This idealistic metaphysic is very well-known in 
modern social theory, but Brown’s sophistication is evident in his being 
aware of objections that can be brought against it. He thus admits that the 
creation of a particular set of norms may itself be causally explained, and he 
notes too that certain cultures may contain contradictions and omissions 
such that no orders are issued from the commanding heights of our 
conceptual equipment. This is a very large matter and I certainly accept that 


1 P. URBACH [1985]: ‘Good and Bad Arguments Against Historicism’ in Popper and the Human 
Sciences, ed. by Gregory Currie and Alan Musgrave. Forthcoming. Nijhoff, The Hague. 


there is evidence that conceptual constraint does sometimes exist. But 
fundamentally Brown’s approach seems to me misguided. To understand 
ideologies it is necessary also to ask about social forces that enable them to 
spread, and to be maintained. One secret of the latter is that large ideologies 
are often highly flexible in content; they tend to be loose and baggy 
monsters, full of saving clauses allowing particular groups to articulate their 
own interests, and not the type of seamless wonders enforcing pure 
conceptual constraint that Brown seems, in the end, to envisage. It would be 
very easy to make many more points against the idealism implied in Brown’s 
concern with rule-guided behaviour, but one must suffice. He notes himself 
the presence of rule-breaking, but fails to note how systematic this often is in 
social life. It seems to me a great mistake to even wish to read off a social 
order from a set of rules, as seems to happen routinely. 

Brown’s second general contention is, interestingly, that there neverthe- 
less remains room for the search for strong social generalisations. These 
words are carefully chosen. At times, and particularly when asserting that 
real social knowledge will only come when ecological and other factors are 
combined with, rather than falsely separated from, social ones, he sounds as 
if he is issuing a new call for the search for social laws. But he remains 
opposed to the search for historical laws, and believes that strict social laws 
will perhaps be unobtainable given that the problematic of the social 
sciences is likely to be dictated by socio-political rather than scientific 
criteria. 

As it happens, I think that we do possess some social laws, and I am not 
prepared to write off the search for historical laws either. It would be foolish 
to deny some of his points, and others made ably by John Stuart Mill. The 
social sciences are not a cognitive success story, and I grant that strict 
historical laws will not be found. The rise of the West took place once and 
thereafter changed the rules for development in the rest of the world 
(roughly: they had something to copy); all that can be offered in con- 
sequence is plausible historical reconstruction. Similarly, there can be no 
general theory of the world religions: there are but four major cases and each 
of these gelled with civilisations that differed from each other in a myriad of 
ways. But the fact that the task is Sisyphean does not thereby make it the less 
necessary, nor invalidate the insistence on such strictness as is possible: nor 
does the fact that generalisations in the past were poor mean that we cannot 
improve upon them now. Bluntly, all societies have views of their place in 
history, and much hangs on whether, for example, we consider ourselves as 
members of a capitalist, a liberal capitalist or a democratic society. It is as 
well to be as open as possible about one’s philosophic history, otherwise 
various assumptions are covertly smuggled into social practices. This too is 
a large matter which I have discussed fully elsewhere.¹ With reservations, a 
strong case can be made against the critics of historicism: there is no reason 
to believe that all philosophies of history lead to dangerous social 
engineering, albeit some have done, and there are very good arguments to be directed 
against key tenets of the critical case, notably against the view that the 
growth of knowledge undermines historical prediction. 


1 J. A. HALL [1985]: Powers and Liberties, Blackwell. Chapter one and passim. 


JOHN A. HALL 
University of Southampton 


LEWIS, DAVID [1983]: Philosophical Papers Volume 1. Oxford University 
Press. Hardback £20.00. Paperback £8.95. xii+285 pp. 


David Lewis is certainly one of the most influential philosophers of our 
time, and in this impressive collection we have fifteen of his papers on 
ontology, philosophy of mind, and philosophy of language. Papers on 
counterfactuals, causation, and the like will follow in a second volume. All 
the papers have been published before, and many are very familiar indeed. 
For example, ‘How to Define Theoretical Terms’ is a canonical text on 
functionalist definitions, and there can scarcely be a student of the 
philosophy of language who is not familiar with the contents of ‘Languages 
and Language’ and ‘General Semantics’. But the collection is still im- 
mensely valuable, for a number of reasons. First, and most obviously, it 
renders the papers—familiar and less well known—easily and economically 
accessible. Second, it includes Postscripts to eight of the papers. Third, it 
provides a very clear view of the Lewis way of doing philosophy—a 
methodology made explicit in the Introduction and in ‘Holes’ (written 
jointly with Stephanie Lewis). 

The most celebrated (or notorious) Lewis package, his extreme modal 
realism, dominates the first part of the collection. Its contents are laid out in 
‘Anselm and Actuality’ and ‘Counterpart Theory and Quantified Modal 
Logic’, to both of which there are quite extensive Postscripts. The main 
idea behind counterpart theory is familiar and clear. What a modal sentence, 
say ‘$\exists x\,\Diamond Fx$’, really says is that there is an object (a world-bound 
individual) x in the actual world, such that for some possible world w there is 
a counterpart y of x in w such that y is F. In this translation, there is an 
existential quantifier over worlds, corresponding to the operator ‘$\Diamond$’ in the 
original, and a further existential quantifier ‘there is a counterpart ...’. This 
latter corresponds to nothing explicit in the original modal sentence so that, 
by Lewis’s lights, the language of quantified modal logic (QML) obscures 
the real truth about modality. Similarly, there is an extra universal quantifier 
‘for every counterpart y of x in w’ in the translation of, say, ‘$\exists x\,\Box Fx$’. But 
though this much, and more, is clear, there are in the literature many queries 
about details. 
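
The two translations just described can be set out schematically. What follows is a sketch in illustrative shorthand rather than a quotation from Lewis: $W(w)$ is assumed to abbreviate ‘w is a possible world’, $I(x, w)$ ‘x is in world w’, $C(y, x)$ ‘y is a counterpart of x’, and $@$ is assumed to name the actual world. The block assumes the amsmath and amssymb packages.

```latex
\begin{align*}
\exists x\,\Diamond Fx
  &\quad\text{becomes}\quad
  \exists x\,\bigl[\,I(x,@) \land \exists w\,\exists y\,
    \bigl(W(w) \land I(y,w) \land C(y,x) \land F(y)\bigr)\bigr] \\
\exists x\,\Box Fx
  &\quad\text{becomes}\quad
  \exists x\,\bigl[\,I(x,@) \land \forall w\,\forall y\,
    \bigl((W(w) \land I(y,w) \land C(y,x)) \rightarrow F(y)\bigr)\bigr]
\end{align*}
```

In each case the quantifier over y is the one to which nothing explicit corresponds in the original modal sentence.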

One problem that was mentioned by Lewis in the original paper (pp. 31-2 
in this volume) is that on this translation it is true of any actual object that it 
necessarily exists. Essentially this worry was pressed by Allan Hazen [1979], 
and although Lewis responds to many of Hazen’s points in Postscripts, this 
one goes unanswered. It seems to me, in fact, that the worry is not 
intrinsically connected with counterpart theory; it is, rather, a general 
problem with what is called weak necessity. The interpretation of ‘$\Box$’ as 
expressing weak necessity makes for the smoothest formalisation of 
essentialist claims (such as ‘Socrates is necessarily a man’). But it raises 
apparently insuperable difficulties in expressing the difference between 
Socrates and, say, numbers in point of necessary vs contingent existence. 
The main idea for an alternative must be to allow that ‘$\Box\,\mathrm{Exists}(\mathrm{Socrates})$’ 
says that Socrates exists necessarily (which is presumably false) while 
‘$\Box(\mathrm{Exists}(\mathrm{Socrates}) \rightarrow \mathrm{Human}(\mathrm{Socrates}))$’ expresses the plausible 
essentialist claim. In applying the idea to counterpart theory, the wrinkle is that we 
need to allow that an object x can have a counterpart y relative to a world w, 
even though y does not exist in w. (For some details, see Forbes [1982], 
[1983].) 

It is a feature of counterpart theory that a single object in one world can 
have two or more counterparts in some other world, and vice versa. This has 
a number of problematic consequences: one concerns the necessity of 
identity, another the interpretation of an ‘actually’ operator. (See again 
Hazen [1979].) The problem about identity is, of course, that there may be x 
and y in the actual world such that x = y, and there may be a world w, a 
counterpart v of x (that is, of y) in w, and a counterpart z of y (that is, of x) in 
w, such that $v \neq z$. But in that case the modal sentence 


$$\exists x\,\exists y\,[x = y \;\&\; \Diamond(x \neq y)] \tag{1}$$


comes out true. 
In a Postscript (pp. 45-6), Lewis responds to this in an interesting way. 
First, he accepts the schema (Leibniz’s Law) 


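$$\forall x\,\forall y\,[x = y \rightarrow (\text{---}x\text{---} \leftrightarrow \text{---}y\text{---})]. \tag{2}$$
Then, he denies that the sentence 
$$\forall x\,\forall y\,[x = y \rightarrow (\Diamond\, x \neq y \leftrightarrow \Diamond\, y \neq y)] \tag{3}$$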


is an instance of this schema. The reason is that what ‘$\Diamond\, x \neq y$’ says of x is 
that for some world w it has a counterpart in w not identical with some 
counterpart of y in w. And this is not what ‘$\Diamond\, y \neq y$’ says of y. Because there 
are two occurrences of the same variable, what it says of y is that for some 
world w it has a counterpart in w which is not self-identical. The point is that 
on Lewis’s translation scheme ‘$\Diamond\, x \neq y$’ conceals two extra quantifiers (one 
for each variable) while ‘$\Diamond\, y \neq y$’ conceals only one. 
Thus, Lewis accepts 


$$\forall x\,\Box(x = x) \tag{4}$$
but rejects 
$$\forall x\,\forall y\,[x = y \rightarrow \Box(x = y)]. \tag{5}$$
(See p. 36.) 
However, a different translation scheme (not obviously inferior) permits an 
alternative line of response. (See again Forbes [1983]). We can allow that (3) 
is an instance of (2), but reject (4), if we modify the translation scheme so 
that it is occurrences of variables that matter. Then, for example, if in world w 
there are distinct counterparts v and z of an object x in the actual world, then 


$$\exists x\,\Diamond(x \neq x) \tag{6}$$


comes out true. If we follow this alternative path, we shall presumably want 
to distinguish (6) from a sentence that says (absurdly) that there is an object 
in the actual world which has a non-self-identical counterpart in some 
world. To mark the distinction we shall, it seems, need to enrich the notation 
of QML with something tantamount to predicate abstraction. 
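
The contrast between the two translation schemes can be checked mechanically on a very small model. The sketch below is only an illustration under stated assumptions: the model (one actual object `a` with two counterparts `v` and `z` in a single world `w`) is hypothetical, all function names are invented here, and the `per_occurrence` flag is one simple reading of the suggestion that occurrences of variables are what matter, not a reconstruction of any published formal system.

```python
from itertools import product

# Hypothetical toy model: one actual object "a" with two counterparts,
# "v" and "z", in a single non-actual world "w".
ACTUAL_OBJECTS = ["a"]
WORLDS = ["w"]
COUNTERPARTS = {("a", "w"): ["v", "z"]}


def counterpart_choices(occurrences, world, per_occurrence):
    """Yield one tuple of counterparts for each admissible choice.

    `occurrences` lists (variable_name, actual_object) pairs in the order
    in which the variables occur inside the modal operator.
    """
    # Per-variable scheme: repeated occurrences of a variable share a slot.
    # Per-occurrence scheme: every occurrence gets a slot of its own.
    if per_occurrence:
        slots = list(occurrences)
    else:
        slots = list(dict.fromkeys(occurrences))
    pools = [COUNTERPARTS.get((obj, world), []) for _, obj in slots]
    for picks in product(*pools):
        if per_occurrence:
            yield picks
        else:
            chosen = dict(zip(slots, picks))
            yield tuple(chosen[occ] for occ in occurrences)


def possibly(pred, occurrences, per_occurrence=False):
    """True if some world and some choice of counterparts satisfy pred."""
    return any(pred(*picks)
               for world in WORLDS
               for picks in counterpart_choices(occurrences, world, per_occurrence))


def necessarily(pred, occurrences, per_occurrence=False):
    """True if every world and every choice of counterparts satisfy pred."""
    return all(pred(*picks)
               for world in WORLDS
               for picks in counterpart_choices(occurrences, world, per_occurrence))


def differ(p, q):
    return p != q


def same(p, q):
    return p == q


def sentence_1(per_occurrence=False):
    """(1): some x and y are identical, yet possibly x differs from y."""
    return any(x == y and possibly(differ, [("x", x), ("y", y)], per_occurrence)
               for x in ACTUAL_OBJECTS for y in ACTUAL_OBJECTS)


def sentence_4(per_occurrence=False):
    """(4): everything is necessarily self-identical."""
    return all(necessarily(same, [("x", x), ("x", x)], per_occurrence)
               for x in ACTUAL_OBJECTS)


def sentence_6(per_occurrence=False):
    """(6): something is possibly distinct from itself."""
    return any(possibly(differ, [("x", x), ("x", x)], per_occurrence)
               for x in ACTUAL_OBJECTS)
```

On this model the per-variable scheme makes (1) and (4) true and (6) false, whereas calling `sentence_4(per_occurrence=True)` and `sentence_6(per_occurrence=True)` gives false and true respectively, which is the pattern described in the text.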

These slightly technical reflections prompt some more general questions. 
To what extent can we make sense of the examples involved, without being 
extreme modal realists? Can we get our minds around the idea that (1) or (6) 
might be true, without making ineliminable use of the apparatus of possible 
worlds and counterparts? If we cannot, should we accept Lewis’s package, 
or deny that the sentences could be true? 

On Lewis’s view, conceptual priority attaches to quantification over 
possible worlds, rather than to the modal operators. If, on the other hand, 
one holds that priority attaches to the operators, then one can proceed in 
either an eliminativist or a conciliationist way. On the eliminativist way, the 
idea is to enrich the original operator language so that everything legiti- 
mately expressible using quantification over worlds is expressible using 
operators. (The legitimation has to do with compatibility with the claim of 
conceptual priority.) Quantification over worlds is dispensable—strictly 
speaking, there are no possible worlds. The obvious query that the 
eliminativist faces is whether the enriched operator language is just a 
notational variant of a language of quantification over worlds. The 
conciliationist, on the other hand, accepts that, strictly speaking, there are 
possible worlds. He can also accept that, armed with an ontology of worlds, 
we can formulate conditions which could not be entertained given only the 
basic concepts expressed by the operators. And he can agree with Lewis 
[1973] that ‘Possible worlds are what they are and not some other thing’ (p. 
85). But he disagrees with Lewis by denying that possible worlds are things 
of the same kind as the actual world, in the sense of us and our surroundings. 
(See my [1983a], pp. 130-1.) 

The conciliationist view is, it seems, a version of what Lewis calls 
moderate modal realism. (The difference between the extreme and moderate 
modal realist is well described in a Postscript to ‘Attitudes De Dicto and De 
Se’, pp. 157-8.) So what is Lewis’s fundamental objection to the moderate 
view? It is that ‘we are left with theories . . . that no longer offer instructive 
analyses’. For, given what the moderate modal realist is envisaged as saying 
about the notion of truth at a world, ‘it becomes uninstructive to analyze 
necessity as truth at all . . . worlds’ (p. 158). But, of course, the conciliationist 
started from the claim that conceptual priority attaches to the operators, and 
so is not going to feel threatened by that objection. There is a great deal more 
to be said on this topic, and doubtless a great deal more will be said in 
Lewis’s forthcoming book On the Plurality of Worlds. 

The second part of the collection begins with ‘An Argument for the 
Identity Theory’, which serves as a beautifully clear exposition of func- 
tionalist type-type identity theories. A complication is taken up in the 
relatively recent ‘Mad Pain and Martian Pain’. Briefly, the complication is 
that Lewis wants to allow both (i) that a physical state quite different from 
the state that is pain in us could be pain in a Martian, in virtue of having the 
right causal role, and (ii) that a state could be pain in (a mad) one of us despite 
lacking the right causal role, provided it was of the right physical type. How 
can Lewis combine these features of an individualist functionalist theory 
(for the Martian) and of a simple type-type identity theory (for the 
madman)? His solution is to make use of the notion of a causal role for a 
population. The Martian is in pain because he is in a state that, with few 
exceptions, occupies the right causal role in Martians. The madman is in 
pain because he is in a state that, with few exceptions of which he is one, 
occupies the right causal role in humans. But then, for any given individual, 
which is the appropriate population to consider? Lewis provides a number 
of conditions which, in the problematic cases, have to be balanced against 
one another. One condition is that ‘an appropriate population should be a 
natural kind’ (p. 157). Where does this condition come from? Here, I 
suggest, Lewis’s account might benefit from a little injection of teleology 
rather as, in my view [1983b], his account of perception [1980] can be 
improved by appeal to the notion of teleological function. The rough idea 
would be that a state is pain if it is its function to occupy the right causal role. 
Such an appeal to function would show the invocation of natural kinds to be 
not ad hoc. 

Amongst the papers on the philosophy of language, ‘Scorekeeping in a 
Language Game’ is perhaps slighter than many in the collection. The idea is 
that several different pragmatic phenomena, including presupposition, the 
interpretation of underspecific definite descriptions, and resolution of 
vagueness, are all subject to rules of a common form: rules of accom- 
modation. Very roughly indeed, if a certain continuation of a conversation 
requires for its acceptability that something be so—a presupposition be in 
force, an object be salient, the standard for a vague predicate be raised—then 
continuation of the conversation in that way suffices for it to be so. I have 
just two brief comments on this. First, it is not obvious that all the examples 
are instances of a common feature of conversation; explicit performatives, 
for example, might seem to be a conventionalised special case. Second, 
Lewis notes that in some cases accommodation is easier in one direction than 
in the opposite direction. Thus, in the case of vagueness it is easier to raise 
standards than to lower them, and in the case of relative modalities it is easier 
to include more possibilities as relevant than to include fewer. Lewis offers no 
explanation of these asymmetries. It would be interesting, in the light of 
theories about what psychological processes are involved in utterance 
interpretation, to see whether the asymmetries can be accounted for in terms 
of general features of those processes; background information once 
accessed and used, for example, remains highly accessible. 

‘Truth in Fiction’ has considerable contemporary interest, and Lewis has 
added a number of Postscripts, one noting connections with the work of 
Kendall Walton—work whose influence is to be seen also in Gareth Evans’s 
chapter on existence statements ([1982], Chapter 10). In fact, a problem 
raised by Evans will serve to illustrate a general issue. 

There is more that is true in a fiction, or game of make-believe, than is 
explicitly stated to be so. How are these extra propositions incorporated? 
Evans and Lewis both agree that the incorporation works on lines similar to 
counterfactual conditionals. But Evans noted that this may present a 
problem for a possible worlds account of counterfactuals, because the 
antecedents of the conditionals are often, by essentialist lights, not possibly 
true. (‘If these globs of mud were pies . . .’ [1982], p. 335.) Lewis himself is 
not much of an essentialist, and could avoid the difficulty by judicious choice 
of the counterpart relation. But this kind of case illustrates the problems 
confronting anyone who picks selectively from a Lewis package. 


MARTIN DAVIES 
Birkbeck College, University of London 


Twelve Questions about Keynes’s 
Concept of Weight 


by L. JONATHAN COHEN 


Introduction: Keynes’s concept of weight 

1 How Does Weight Come to Matter? 

2 Why Did Keynes Not Appreciate How Weight Matters? 

3 Is Weight Increased by Each Addition of Relevant Evidence? 

4 Do Arguments Inherit Weight Via The Entailments of Their Conclusions? 

5 Can Arguments in Co-ordinate Terms be Compared for Weight? 

6 Does Weight Have Any Limiting Cases? 

7 If One Premiss is of Greater Relevance Than Another, Does it Add More 
Weight? 

8 Is Weight Determined by Related Probabilities? 

9 Can One Argument be Compared for Weight With Another if its Terms 
Neither Entail nor are Co-ordinate with the Other’s Terms and Neither 
Argument is a Limiting-Case? 

10 Can Weight be Ranked as Well as Compared? 

11 Is it Worth While Knowing the Weight of an Argument Without Knowing 
its Probability? 

12 What is the Connection Between Keynesian Weight and Baconian 
Legisimilitude? 


INTRODUCTION: KEYNES’S CONCEPT OF WEIGHT 


In chapter VI of his [1921] Keynes treats the probability of H on E as a 
property of the argument from E to H. The probability depends for its value 
on the balance between the favourableness and unfavourableness of the 
evidence that E states in relation to H. But he considers that there may be 
another respect in which some kind of quantitative comparison between 
arguments is possible. ‘This comparison turns,’ he says, ‘upon a balance, not 
between the favourable and the unfavourable evidence, but between the 
absolute amounts of relevant knowledge and relevant ignorance. As the 
relevant evidence at our disposal increases, the magnitude of the probability 
of the argument may either decrease or increase, according as the new 
knowledge strengthens the unfavourable or the favourable evidence; but 
something seems to have increased in either case,—we have a more 
substantial basis upon which to rest our conclusion.’ Keynes expresses this 
by saying that an accession of relevant evidence increases what he calls the 
‘weight’ of an argument. Thus the weight of an argument is independent of 
the correctness or incorrectness with which such-or-such a probability is 
assigned to the argument and is not necessarily determined by the probable 
error of the argument’s conclusion (where that conclusion assigns a value to 
a magnitude). Keynes says that, metaphorically, the weight of the argument 
from E to H measures the sum of the favourable and unfavourable evidence 
that E states for H, and the probability measures the difference. But he does 
not suggest any method by which weights might be measured and in fact 
admits that often one cannot even compare the weights of different 
arguments. He thinks that, ‘in deciding on a course of action, it seems 
plausible to suppose that we ought to take account of the weight as well as the 
probability of different expectations’. But he finds it difficult to think of any 
clear example of this, and he does not feel sure that the theory of evidential 
weight has much practical significance. 

In this paper I shall raise twelve questions about weight, in Keynes’s 
sense; and by answering those questions I shall try to show not only why a 
theory of evidential weight is needed, but also what form it should take if 
Keynes’s seminal intuition is to be preserved. I shall not, however, use 
Keynes’s own symbolism. Keynes used the letters ‘$h$’, ‘$h_1$’, ‘$h_2$’, etc. for the 
premisses of an argument, not its conclusion, which is confusing to a 
contemporary reader. So I shall use ‘$E$’, ‘$E_1$’, ‘$E_2$’, etc. for premisses and ‘$H$’, 
‘$H_1$’, ‘$H_2$’, etc. for conclusions. Also he abbreviated ‘the probability of the 
argument from h to a’ as ‘a/h’, whereas I shall use instead the more 
conventional formula ‘p(H/E)’ for the probability of the argument from E to 
H. But I shall follow Keynes in speaking sometimes of the weight of the 
argument from, say, E to H, sometimes of the weight of the evidence E for 
H, and sometimes of the weight of the probability of H on E. This flexibility 
is harmless so long as one remembers that by speaking of the weight of the 
probability of H on E Keynes does not intend to treat weight as a property of 
certain propositions such as of the proposition p(H/E) = n. On his view it is 
possible to know the weight of a probability without knowing its value, or to 
know its value without knowing its weight. 


1 HOW DOES WEIGHT COME TO MATTER? 


In calculating the premium that a client should pay for a life insurance policy 
maturing at age 65 a company that wanted to minimise the risk of a loss on 
this class of business would ideally determine the probability of the client’s 
death before the age of 65 on the basis of all the relevant facts, i.e. of all the 
facts that affect the probability one way or the other. In practice it may well 
be uneconomic to enquire too closely into the client’s health, ancestry and 
life-style, and some of the relevant facts (e.g. about grandparents’ medical 
histories) may be quite unobtainable. But at least the company needs to 
know the client’s sex, say, and whether he or she is at present ill or has a 
particularly dangerous job or hobby. Since people who have a reason to fear 
early death are particularly likely to want life insurance protection in the 
interests of their dependents, a company risks bankruptcy if it does not take 
such reasons into account in its calculation of premiums to be charged. 
Equally it risks losing business to other companies if it does not offer 
appropriate reductions in premiums to clients who show obvious prospects 
of exceptional longevity. In other words the company must in each case have 
an appropriate weight of evidence for the probability of survival to 65 that is 
accepted as the basis for calculating an economic premium. Since an 
individual may have many relevant features that constitute items of 
favourable or unfavourable evidence about his survival probabilities, a 
reliable determination of these probabilities will need to be based upon a 
sufficiently weighty combination of evidential features. And just the same 
concern for the weight of relevant evidence would be needed if, instead of 
trying to predict whether their client will survive to the age of 65, the 
company were trying rather to explain why he has done so. 

Philosophers sometimes invoke, in this kind of connection, Carnap’s 
([1950], p. 211) requirement of total evidence. For a conditional probability 
judgement to be applied as the major premiss of a prediction or explanation, 
they say, it must be based on all the relevant evidence available. Or, if they 
prefer a relative-frequency theory of probability to a logical-relation one, 
they may invoke Hempel’s ([1965], p. 397) requirement of maximal 
specificity for the reference class. But in practice such requirements can 
rarely, if ever, be satisfied. Even to know just the available evidence 
(physical, meteorological, geological, astrophysical, epidemiological, socio- 
political, etc.) bearing on a person’s survival to 65 one would have to work 
away indefinitely, since what is not available to-day might, with sufficient 
effort, be made available to-morrow. And almost certainly, if one were 
thinking in terms of relative frequencies, one would soon be reduced to a 
reference-class of one member—the person himself—so that no statistical 
data could be compiled. What is important in practice, therefore, is for such 
a probability to have as much weight as is permitted by the methodology of 
estimation, by pertinent economic constraints, by the current state of 
scientific knowledge, by the nature of the subject-matter and by any other 
limits to enquiry. So at least comparative judgements of weight have to be 
made, and it may be worth while investigating whether the weights of 
different probabilities can also be ranked or measured. But we can afford to 
take rather less interest in theoretical ideals like the requirement of total 
evidence. In the real world we are occupied with what is better and what is 
worse, not with what is perfect. (Of course, Bayesian conditionalisation may 
be seen in Keynesian terms as a device for increasing weight. But it too does 
not issue in explicit comparisons, rankings or measurements of weight.) 

An illuminating way to look at the matter is this. Suppose that the 
estimated probability that a person will survive to age 65, on the evidence 
that he or she is a lorry-driver, is 0.8. This is a generalised judgement of 
conditional probability, in the sense that it does not assert anything about a 
particular person. From it we can derive its instantiation for a particular 
person only on the assumption that our reference to that person does not add 
to or subtract from the evidence on which the probability is conditional. 
(Indeed without that assumption we could hardly derive its instantiation— 
as we can!—in counterfactual cases, i.e. even in the case of a person who is 
not in fact a lorry-driver.) But on the unconditional issue no such 
assumption is possible. If we want to infer the unconditional survival 
prospects of a particular person—say, Mr Smith—we cannot avoid the need 
to allow for the fact that Mr Smith may be specially circumstanced in some 
relevant way: lorry-driving may be his work, but hang-gliding may be his 
hobby. So inferability to a singular judgement of unconditional 
probability (to the judgement, say, that the probability of Mr Smith’s 
surviving to 65 is 0.8) must depend on how much of the relevant facts 
about Mr Smith is included in the premisses. That is, this inferability varies 
directly with the weight of the evidence. If p(H/E) = n, then the weight of 
p(H/E) determines the strength of our entitlement to infer from E to 
p(H) = n. 

At any rate it is convenient to speak thus in the present context. But 
strictly speaking, and to avoid any possibility of confusion, we should note 
that, because of the considerations just mentioned, there is some 
equivocation here between ‘p(H) = n’ and ‘p(H/E) = n’. In the latter (the 
singularised judgement of conditional probability) each occurrence of the 
referring expression ‘Mr Smith’ may be replaced salva veritate by an 
occurrence of, say ‘Mr Brown’, whereas in the singularised judgement of 
unconditional probability that is not so. Correspondingly the value of this 
p(H) is not related to the value of this p(H/E) by Bayes’ theorem. For, if 
‘p(H) = n’ had instead been used here to state a prior probability relative to 
‘p(H/E) = n’, we should have had to grant the former the same 
substitutivity entitlements as the latter. Indeed philosophers who use the 
expression ‘statistical syllogism’ to name an inference of the kind in question, 
from ‘E’ and ‘p(H/E) = n’ to ‘p(H) = n’, are speaking rather misleadingly. Genuine 
syllogisms, whether valid or invalid, do not equivocate thus between 
premiss and conclusion. 


2 WHY DID KEYNES NOT APPRECIATE HOW WEIGHT 
MATTERS? 


When we view the problem in the above terms it is easy enough to see not 
only how weight matters but also why Keynes was unable to appreciate how 


1 The difference between counterfactualisable and non-counterfactualisable probabilities is 
discussed in (Cohen [1986], §§17-19). A generalised conditional probability that is 
non-counterfactualisable has zero weight, since its value is the outcome of an accidental 
relationship and affords no basis for inference, in a particular case, to the value of an 
unconditional probability. 


Keynes's Concept of Weight 267 


it matters. Because he held that probability should be thought of as a logical 
relation (and because he apparently had no conception of deducibility from 
the null premiss), he could hardly allow a place for judgements of 
unconditional probability other than as mere ellipses of ‘ordinary speech’ 
([1921], p. 7). A fortiori he could hardly appreciate the existence of the 
problem of how to grade our entitlement to detach an unconditional 
probability from a conditional one. It is scarcely surprising therefore that 
Keynes admitted to doubting whether the theory of weight had any practical 
significance, since by his very treatment of probability as a function of 
ordered pairs of propositions he had cut himself off from the possibility of 
articulating the nature of this significance. 

Nevertheless it is very much to Keynes’s credit that he did not suppress 
his various intuitions about the nature of weight. And perhaps these 
intuitions were strengthened by the fact that, like most people, he could not 
altogether rid himself of the intuitive idea that probability can also be 
conceived as the relative frequency with which one set shares its member- 
ship with another, while he recognised that the weight of a relative 
frequency can be increased by a relevant partitioning of the reference class 
([1921], p. 27). Unfortunately, however, the relative-frequency theory of 
probability has a parallel difficulty in articulating just how weight matters. 
Just as the logical relation theory seems not to assign probabilities to single 
propositions (as distinct from ordered pairs of propositions), so too the 
relative frequency theory seems not to assign them to single sets (as distinct 
from ordered pairs of sets). 


3 IS WEIGHT INCREASED BY EACH ADDITION OF RELEVANT 
EVIDENCE? 


In the standard, probability-theoretic sense of ‘relevant’, E₂ is relevant to 
p(H/E₁) if and only if p(H/E₁&E₂) ≠ p(H/E₁). But, if the weight of an 
argument is to continue to grow with each increment of relevant evidence, it 
must clearly also be taken to grow even with the addition of a certain kind of 
irrelevant evidence. This is because one might add the evidence E₂&E₃ 
where E₂ alters the probability of the argument in one direction exactly as 
much as E₃ alters it in the other. In such a case the addition of E₂ on its own 
would increase the weight of the argument first, and then the addition of E₃ 
to the evidence would increase the weight yet further. So, since it should 
presumably make no difference whether E₂ and E₃ are added successively or 
conjunctively, the addition of the conjunction E₂&E₃ must also increase the 
weight of p(H/E₁) even though it is not, in the standard sense, a relevant 
piece of evidence. Keynes therefore defined a proposition as ‘relevant’ to 
p(H/E₁) in this connection if and only if it entails a proposition E₂ such that 
p(H/E₁&E₂) ≠ p(H/E₁). But it would obviously be simpler to retain the 
normal probability-theoretic sense of ‘relevant’, and then to say (using 
Keynes’s functor ‘V(.../---)’ for ‘the weight of the probability of ..., given 
---’) that V(H/E₁&E₂) > V(H/E₁) if and only if E₂ entails a proposition 
that is relevant to p(H/E₁). 
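
The cancellation case described above can be checked on a small table of cases. The following is only a sketch under stated assumptions: the counts are hypothetical, all cases are taken to satisfy E₁, and relative frequencies stand in for probabilities purely for convenience (the argument in the text does not depend on that interpretation). The names `counts` and `p_h_given` are illustrative.

```python
from fractions import Fraction

# Hypothetical counts of cases, all satisfying E1, keyed by (H, E2, E3).
counts = {
    (True, True, True): 2,   (False, True, True): 2,
    (True, True, False): 4,  (False, True, False): 1,
    (True, False, True): 1,  (False, False, True): 4,
    (True, False, False): 1, (False, False, False): 1,
}

def p_h_given(condition):
    """Relative frequency of H among the cases satisfying condition(e2, e3)."""
    selected = {k: n for k, n in counts.items() if condition(k[1], k[2])}
    total = sum(selected.values())
    return Fraction(sum(n for k, n in selected.items() if k[0]), total)

base = p_h_given(lambda e2, e3: True)            # p(H/E1)
with_e2 = p_h_given(lambda e2, e3: e2)           # p(H/E1&E2)
with_e3 = p_h_given(lambda e2, e3: e3)           # p(H/E1&E3)
with_both = p_h_given(lambda e2, e3: e2 and e3)  # p(H/E1&E2&E3)
```

On these assumed counts E₂ raises the probability from 1/2 to 2/3, E₃ lowers it from 1/2 to 1/3, and the conjunction leaves it at 1/2, so the conjunction is irrelevant in the standard sense although each conjunct is relevant.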

Unfortunately this will not quite do as it stands. There is a further 
difficulty, which apparently Keynes did not see. According to classical logic 
any proposition E₂ you like, however irrelevant to p(H/E₁), nevertheless 
entails the disjunction E₂ ∨ H, and E₂ ∨ H is certainly relevant to p(H/E₁) 
because p(E₂ ∨ H/H&E₁) = 1 and so by Bayes’s theorem 

\[
p(H/E_1 \,\&\, (E_2 \lor H)) = \frac{p(H/E_1) \times p(E_2 \lor H/H \,\&\, E_1)}{p(E_2 \lor H/E_1)} = \frac{p(H/E_1)}{p(E_2 \lor H/E_1)}
\]

where p(E₂ ∨ H/E₁) > 0. So Keynes seems to have trivialised the concept of 
weight by allowing any proposition to increase the weight of any argument. 
In order to avoid such trivialisation we need to tighten the conditions under 
which V(H/E₁&E₂) > V(H/E₁). We need to say that this inequality holds if 
and only if E₂ entails a proposition E₃ that is relevant to p(H/E₁), where no 
proposition E₄ occurs in E₃ (or in any equivalent of E₃) such that E₂ entails 
E₄ and, without affecting the relevance of E₃ to p(H/E₁), E₄ can be replaced 
in E₃ (or in some equivalent of E₃) by a proposition that has no relevance to 
p(H/E₁). And we can also say that under just these same conditions E₂ will 
give at least as much weight to p(H/E₁) as E₃ does. 
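
The trivialisation can be illustrated numerically. This is a minimal sketch, not part of the original argument: the values 4/5 and 1/2 are hypothetical, the distribution is taken as already conditioned on E₁, and H is assumed independent of E₂ so that E₂ is irrelevant in the standard sense. The helper names `prob` and `cond` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Hypothetical values, already conditioned on E1. H and E2 are made
# independent, so E2 is irrelevant to p(H/E1) in the standard sense.
P_H = Fraction(4, 5)   # assumed p(H/E1)
P_E2 = Fraction(1, 2)  # assumed p(E2/E1)

# Joint probability of each (H, E2) truth-value pair under independence.
joint = {
    (h, e2): (P_H if h else 1 - P_H) * (P_E2 if e2 else 1 - P_E2)
    for h, e2 in product([True, False], repeat=2)
}

def prob(event):
    """Probability of the set of cases in which event(h, e2) holds."""
    return sum(p for (h, e2), p in joint.items() if event(h, e2))

def cond(target, given):
    """Conditional probability of target on given."""
    return prob(lambda h, e2: target(h, e2) and given(h, e2)) / prob(given)

is_h = lambda h, e2: h
is_e2 = lambda h, e2: e2
e2_or_h = lambda h, e2: e2 or h

p_h = prob(is_h)                         # p(H/E1)
p_h_given_e2 = cond(is_h, is_e2)         # p(H/E1&E2), unchanged
p_h_given_disj = cond(is_h, e2_or_h)     # p(H/E1&(E2 v H)), changed
```

With these assumed values E₂ leaves the probability of H at 4/5, whereas the disjunction that E₂ entails raises it to 8/9, which is p(H/E₁) divided by p(E₂ ∨ H/E₁) = 9/10. The shift requires that p(E₂ ∨ H/E₁) be less than 1 and p(H/E₁) greater than 0, as it is here.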


4 DO ARGUMENTS INHERIT WEIGHT VIA THE 
ENTAILMENTS OF THEIR CONCLUSIONS? 


We have seen in §3 how entailment between evidential propositions affects 
weight. The question also arises whether entailment between conclusions 
carries with it any necessary consequences for assessments of weight. But 
the answer must be that it does not. For we know that, where H₁ entails H₂, 
E₂ may be relevant to H₁ on E₁ and yet not relevant to H₂ on E₁, or to H₂ on 
E₁ and yet not relevant to H₁ on E₁ (Carnap [1950], pp. 348-97). On the 
other hand it is certainly reasonable to assume, so far as weight is a grading of 
inferability, that the weight of an argument is unaffected when any 
proposition in its premisses or conclusion is replaced by another proposition 
that is necessarily equivalent to it. 


5 CAN ARGUMENTS IN CO-ORDINATE TERMS BE COMPARED 
FOR WEIGHT? 


The question now arises whether any comparisons of weight can be drawn 
between p(H₁/E₁) and p(H₂/E₂) when no entailments hold in either 
direction between E₁ and E₂. It is tempting to claim, for example, that 
p(H₁/E₁) and p(H₂/E₂) should be said to have the same weight when each 
pair of propositions ascribes the same pair of predicates though to different 
individuals, and Keynes does claim this ([1921], p. 73). After all, the weight 
of a probabilificatory argument ought surely to remain invariant under 
uniform exchanges of the individuals to which it refers, much as the validity 
of a causal or logical argument remains invariant under these conditions. 
But the same point is perhaps more accurately made by treating weight as 
being primarily a property of generalised conditional probabilities, and 
derivatively of their substitution-instances. Indeed, the weight of the 
evidence is the same for a generalised conditional probability as for each of 
its substitution-instances because the relevance of the evidence is the same 
(on the assumption that our reference to the instantiating individual does 
not add to or subtract from the evidence on which the probability is 
conditional). 

It is also possible to extend the theory of weight by assuming that the 
weight of an argument is unaffected when each occurrence of a particular 
predicate P₁ within its premisses or conclusion is replaced by an occurrence 
of another predicate P₂, just so long as P₁ and P₂ are both members of the 
same family of co-ordinate but mutually inconsistent predicates. A predicate 
might thus derive its weight-increasing potential for a given 
argument from the relevance to that argument of another predicate in the 
same family, or from its relevance to an argument formulated in terms of 
predicates that belong to the same families as the predicates in the given 
argument. 

What would be the justification for this? First, it seems reasonable that, 
when we have conducted an enquiry to find out whether a candidate for life 
insurance runs exceptional health risks in his leisure-time activity, we 
should be able to add the results of the enquiry to the premisses of the 
argument for his survival to 65 in such a way as to give that argument the 
same increment of weight independently of whether the probability of the 
argument is affected by this (as might be the case if his only hobby were 
discovered to be hang-gliding) or not affected (as might be the case if his 
only hobby were discovered to be stamp-collecting). After all, whichever the 
outcome the detachment of a probability for the argument’s conclusion is 
now protected in relation to that issue, and the primary purpose of assessing 
weight is, as we have seen in §1, to evaluate such detachability. So though 
Keynes’s test gives us no authority for the idea, it is difficult to avoid 
supposing that a viable theory of weight should allow invariance of weight 
under changes of a premiss’s predicate within the same family: if one 
member of the family increases the weight of an argument to a particular 
conclusion, each of the others does also, and if one argument to a particular 
conclusion with premisses E₁, E₂, ..., Eₙ has its weight increased by a certain 
additional premiss then so does every other argument (to the same 
conclusion) that differs only by replacing a predicate in one of E₁, E₂, ..., Eₙ 
by a term co-ordinate with it. 

Secondly, if the value of the probability p(H₁/E₁) is known to be affected 
by the addition of E₂ to the premisses, a change is thereby made in the 
constraints that are known to affect the value of any probability 
p(H₂/E₁&E₂) where H₂ is inconsistent with H₁, because we know 
mathematically that p(H₂/E₁&E₂) ≤ 1 − p(H₁/E₁&E₂). So it is reasonable 
to suppose that E₂ increases the weight of p(H₂/E₁) just as much as it 
increases that of p(H₁/E₁). Keynes speaks of this only in the case in which 
H₂ is the contradictory of H₁, where he accepts that V(H/E) = V(H̄/E). But 
the principle at issue is a more general one, extending to families of mutually 
exclusive predicates as well as to pairs of contradictory ones. 

We see thus that the same principle is at work in regard to both the 
premisses and the conclusion of an argument. The weight of the argument is 
unaffected by the substitution of one predicate for another within the same 
family of co-ordinate but mutually exclusive predicates. 


6 DOES WEIGHT HAVE ANY LIMITING CASES? 


If a necessarily true proposition is added to an argument’s premisses 
(whatever they may be) it cannot change the probability of the argument. 
Hence it adds no weight to an argument, and if all the premisses of an 
argument are necessarily true, the argument has minimal weight. 

On the other hand, according to classical logic, if a necessarily false 
proposition is added to an argument’s premisses, it entails every proposition 
and a fortiori therefore it entails every proposition that is relevant to the 
conclusion. So on this basis, if the premisses of an argument contain a 
contradiction, the argument has maximal weight. 

If an argument has a necessarily true conclusion, it has a probability of 1 
whatever the premisses. So no additional premiss can affect its probability, 
and it must therefore be regarded as already having maximal weight. 
(Correspondingly, if H is necessarily true, we are fully entitled to detach 
p(H) = 1 from p(H/E) = 1 whatever E may be.) 

Similarly if an argument has a necessarily false conclusion, it has a 
probability of 0 on any premiss, and again no additional premiss can affect 
its probability. So it too has maximal weight (as, indeed, we could also show 
by combining the principle V(H/E) = V(H̄/E) from §5 with the principle 
established in the previous paragraph). 

Also, if the premisses of an argument already entail its conclusion, or if 
they already contradict its conclusion, no additional premiss can affect its 
probability. So under these conditions too the argument has maximal 
weight. 


7 IF ONE PREMISS IS OF GREATER RELEVANCE THAN 
ANOTHER, DOES IT ADD MORE WEIGHT? 


The question here—not discussed by Keynes—is this. Suppose 
p(H/E₁&E₂) differs more from p(H/E₁) than does p(H/E₁&E₃). We can 
then say that E₂ has a relevance to p(H/E₁) that is of greater extent than E₃ 
has. Is the weight of p(H/E₁&E₂) therefore greater than that of p(H/E₁&E₃), 
or not? There are some reasons for saying that it is, but stronger reasons for 
saying that it isn’t. 

On the one hand it seems at first sight intuitively implausible to hold that 
the probability of a male person’s surviving to age 65 has just as much weight 
as the probability that a person with a dangerous hobby will survive to age 
65. If time or other resources for enquiry were in short supply, would it not 
be much more important in assessing a life insurance premium to determine 
whether the person concerned had a dangerous hobby than to determine 
what the person’s sex was? And, in general, if we consider the kind of 
purpose for which comparisons of weight are needed, it looks at first as 
though the extent of a new premiss’s relevance ought to enter into 
comparisons of incremental weight. If we want to detach a value for the 
probability of H, we seem better off starting with a value for p(H/E₁&E₂) 
than with one for p(H/E₁&E₃) if E₂ is more relevant to p(H/E₁) than E₃ is. 

On the other hand this way of assessing weight soon plunges into paradox. 
Suppose a set of evidential items E₁, E₂, ..., Eₙ in regard to a hypothesised 
conclusion H. Suppose too that quite a lot of these items, on their 
own, ground low probabilities in favour of H, quite a lot ground high 
probabilities in favour of H, and quite a lot ground intermediate 
probabilities at varying levels. One way of ordering these items would be to begin 
with those highly in favour of H, then proceed with those slightly less in 
favour and so on down, ending up with those highly in favour of H̄. In such a 
carefully graduated order the extent of the relevance of each new piece of 
evidence, after the first, would tend to be small. So if the weight of the 
argument were to be affected by the extent of the relevance of each 
incremental piece of evidence, as well as by the number of those pieces, the 
additional effect on the overall weight would be minimal. But, if instead the 
evidential premisses were ordered so as to alternate as violently as possible 
between favourable and unfavourable items, the overall effect on the weight 
would be very different, if extent of relevance was allowed to affect the issue 
at each incremental step. Hence, if we allow the extent of an added premiss’s 
relevance to affect the cumulative weight of an argument, we could end up 
with different weights for an argument to the same conclusion from logically 
equivalent premisses, just because we calculated the weights on the basis of 
different orderings for the addition of new premisses. This is certainly 
inconsistent with what was said about the substitutivity of equivalents in §4, 
and seems unacceptably paradoxical. The total weight of the evidence for a 
conclusion ought to be independent of the order in which different 
evidential items are stated. For some people the psychological effect may 
vary with the order of statement, but a competent reasoner ought to be able 
to discount such effects. So I conclude that in a theory developing Keynes’s 
seminal idea any weight added to an argument by a new premiss should stem 
solely from the fact that this premiss entails a relevant proposition and not at 
all from the extent of that relevance. 
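
The order-dependence objection can be made concrete with a toy calculation. What follows is a sketch under assumptions that are not in the text: each evidential item is represented by a hypothetical likelihood ratio, the items are treated as conditionally independent so that odds multiply, and the rejected 'extent-sensitive' proposal is modelled, for illustration only, as the sum of the absolute shifts in probability at each step. The function names are likewise illustrative.

```python
from fractions import Fraction

def prob_from_odds(odds):
    """Convert odds in favour of H into a probability."""
    return odds / (1 + odds)

def extent_sensitive_weight(likelihood_ratios, prior_odds=Fraction(1)):
    """Sum of absolute probability shifts as items are added in the given order."""
    odds = prior_odds
    total = Fraction(0)
    for lr in likelihood_ratios:
        new_odds = odds * lr
        total += abs(prob_from_odds(new_odds) - prob_from_odds(odds))
        odds = new_odds
    return total

def count_based_weight(likelihood_ratios):
    """Number of items that shift the probability at all (ratio different from 1)."""
    return sum(1 for lr in likelihood_ratios if lr != 1)

# The same four hypothetical items in two different orders.
graduated = [Fraction(4), Fraction(2), Fraction(1, 2), Fraction(1, 4)]
alternating = [Fraction(4), Fraction(1, 4), Fraction(2), Fraction(1, 2)]

graduated_total = extent_sensitive_weight(graduated)
alternating_total = extent_sensitive_weight(alternating)
```

On these assumed figures the graduated ordering yields 7/9 and the alternating ordering 14/15, although the premisses are logically equivalent, whereas the count of relevant items is 4 on either ordering.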


8 IS WEIGHT DETERMINED BY RELATED PROBABILITIES? 


The term ‘weight’ has been used by philosophers to cover one or other of a 
variety of probability-related measures. Thus Reichenbach ([1949], p. 465) 
held that, if we know the limit towards which the relative frequency of a 
certain kind of outcome tends within a sequence of events, then this value 
can be regarded as ‘the weight of an individual posit concerning an unknown 
element of the sequence’. Or, in other words, ‘the weight may be identified 
with the probability of the single case’. Clearly this is not the sense in which 
Keynes was using the term in chapter VI of his [1921], since he emphasised 
that an argument of high weight is not, as such, ‘more likely to be right’ than 
one of low weight. It is easy to find arguments that have high probability but 
low weight, or low probability but high weight. (Elsewhere in his [1921], 
however, Keynes sometimes uses the term ‘weight’ more loosely: e.g. 
p. 218.) 

Again, Good [1968] has discussed weight in the sense in which the weight 
of evidence concerning H that is provided by E, given G, is equal to log 
{p(E/HG) ÷ p(E/H̄G)}. But the quantity of evidence relevant to a certain 
argument is independent of the probability of the evidence given the 
conclusion. A great quantity of evidence might have been collected in a 
murder trial, with most of it tending to incriminate the accused, but it might 
also include an unshakable alibi. In such a case the evidence available might 
have relatively low probability, given the innocence of the accused, but it 
would have a heavy Keynesian weight. Similarly, that a person died before 
the age of 80 given that he died before the age of 8 has maximum probability, 
but the evidential fact that he died before the age of 80 gives a rather small 
increment of weight to any argument that he died before the age of 8. 
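
Good's measure can be written out as a short function to show how it differs from a Keynesian count of relevant evidence. This is a sketch only: the likelihood values are hypothetical, the natural logarithm is an arbitrary choice of base, and the function name is illustrative.

```python
import math

def good_weight_of_evidence(p_e_given_h, p_e_given_not_h):
    """Log of the ratio of p(E/HG) to the probability of E given G and not-H."""
    return math.log(p_e_given_h / p_e_given_not_h)

# Hypothetical likelihoods, chosen only for illustration.
neutral = good_weight_of_evidence(0.5, 0.5)       # E equally likely either way
favourable = good_weight_of_evidence(0.8, 0.2)    # E four times likelier on H
unfavourable = good_weight_of_evidence(0.2, 0.8)  # E four times likelier on not-H
```

The measure is zero when the evidence is equally probable on H and on its negation, and negative when the evidence tells against H, however much evidence has been collected. It therefore tracks the direction and strength of support rather than the quantity of relevant evidence.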

It is sometimes suggested that, if the probability that a person assigns to 
his belief may be quantified in terms of the odds at which he would accept a 
bet on its truth, given specified evidence, then the weight of that evidence 
may be taken to be reflected in the amount that he is prepared to bet: he may 
be expected to be willing to put a larger sum at risk when there is more 
evidence from which to estimate appropriate odds. But other considerations 
also may affect our attitude towards the size of a bet. Suppose that there is a 
great deal of evidence, and that this evidence suggests the appropriateness of 
very long odds. Would you really be willing to risk losing just as large a part 
of your fortune then as you would risk losing if the odds, on the same 
evidence, were much shorter? 

It is also sometimes suggested that the weight of an argument may be 
taken to vary inversely with the mathematical expectation of gain from a 
search for further relevant evidence. But this suggestion is open to at least 
two cogent objections. First, in order to avoid begging the question the gain 
talked about must presumably be in some non-epistemic kind of utility. And 
this raises familiar problems about the evaluation of epistemic functions by 
reference to non-epistemic criteria, as discussed in e.g. Levi’s [1984]. 
Secondly what are we to say when, for example, a vital eye-witness has died 
without ever disclosing what he saw? The expectation of any kind of gain 
from further research in that direction may then be zero, but the weight of 
the evidence about what actually happened is not increased because of the 
missing data. This is because the weight of the evidence obtained is being 
assessed by comparison with the supposed totality of relevant facts, not with 
the supposed totality of discoverable relevant facts. So, even if we had all the 
available evidence, our argument might still not have maximal weight. 
Sometimes, for example, the prosecution cannot prove guilt beyond 
reasonable doubt even though someone must have committed the crime in 
question. 


9 CAN ONE ARGUMENT BE COMPARED FOR WEIGHT WITH 
ANOTHER IF ITS TERMS NEITHER ENTAIL NOR ARE 
CO-ORDINATE WITH THE OTHER’S TERMS AND NEITHER 
ARGUMENT IS A LIMITING-CASE? 


We have seen that arguments may be compared with one another for weight 
where certain entailment relations hold between their premisses (as in §3), or 
where their terms are co-ordinate and mutually exclusive (as in §5), and that 
weight is maximised or minimised in certain special cases (§6), but that 
weight is not determined by the sizes of associated probabilities (§8). The 
question now arises whether any other principles bear on the determination 
of weight. In particular, since the extent of a premiss’s relevance does not 
affect its incremental weight (§7), should we treat this as a licence to suppose, 
by what I shall call ‘the principle of equipollence’, that the members of 
different families of predicates enhance the weight of an argument equally 
when they enter relevantly into its premisses? For example, in §7 we have 
seen reason to reject the view that the probability that a person with a 
dangerous hobby will survive to age 65 has greater weight than the 
probability that a male person will survive to age 65. But does it therefore 
follow that these two probabilities should be attributed the same weight? Is 
the inequality to be rejected because no comparisons of this kind are 
possible, or just because the true comparison is one of equality? I shall argue 
for the former thesis, or—more exactly—for the thesis that comparisons of 
this kind are possible only on the basis of unacceptable assumptions. 
Consider the two predicates ‘has a dangerous hobby’ and ‘has a dangerous 
hobby and a weak heart’. We obviously cannot accept (as the principle of 
equipollence would seem to require) that the weight of evidence is 
unaffected by which of these two predicates is ascribed in the evidential 
proposition, since having a weak heart is relevant to whether a person 
survives to age 65 even on the condition that he or she is a lorry driver and 
has a dangerous hobby. So either we have to reject the principle of 
equipollence altogether, or we must restrict its application to primitive 
predicates in some appropriately tailored language-system. But the latter 
kind of move would introduce a substantial element of linguistic convention 
into the assessment of weight. The weight of an argument would depend not 
just on facts about probabilistic relevance but also on which predicates were 
chosen as primitive and therefore as having no non-trivial entailments. If, 
for example, ‘has a dangerous hobby’ and ‘is male’ were both treated as 
primitive predicates, the probability that a person who has a dangerous 
hobby will survive to age 65 would be assigned the same weight as the 
probability that a male person will survive to age 65; but, if one of the two 
predicates were treated as primitive and the other not, the two probabilities 
would have different weights. So, unless there is reason in a particular area 
of enquiry to suppose that the primitiveness or non-primitiveness of a 
predicate is unambiguously determined by the facts rather than by 
convention, it looks as though the principle of equipollence cannot be 
rescued. 

It would certainly be very convenient if each of the primitive predicates 
that might enter relevantly into the premisses of any argument for a certain 
conclusion added an equal weight and all the relevant predicates that added 
least weight were primitive. The addition of a premiss containing any one 
such predicate could then be supposed to add one unit of weight to the 
argument. But this convenient state of affairs is most unlikely to obtain. The 
primitiveness or non-primitiveness of a predicate is a relatively a priori 
property of it that cannot usefully be made to depend on facts about 
relevance to a particular argument for a particular conclusion. There is a 
fairly obvious reason for this. An issue that has no relevance for one 
argument may have considerable relevance for another, whether the 
arguments be for the same conclusion or for different ones, and the structure 
of a language—including the primitiveness or non-primitiveness of certain 
predicates—cannot usefully be geared to one particular series of arguments 
for one particular conclusion. If it were so geared, the strategy would be self- 
defeating because the range of comparisons permissible within the language 
would be excessively narrow. 


10 CAN WEIGHT BE RANKED AS WELL AS COMPARED? 


If the principle of equipollence is indefensible, there is no natural unit of 
weight and the prospects of any non-arbitrary system for measuring weight 
are very poor. But that does not exclude the possibility of having a 
principled system for ranking it, at least in relation to arguments about a 
given subject-matter, i.e. about conclusions that involve predicates of a 
given family. Such a system would require that the weight of any two 
arguments about the given subject-matter should be comparable, that 
superiority in weight should be a transitive relation, and that a level of least 
weight should be recognisable. And we should have such a system if we had 
an ordering for a certain set of families of evidential predicates and 
concerned ourselves only with arguments from premisses that contain just 
predicates belonging to the first family, or just those predicates plus 
predicates belonging to the second family, or just predicates from each of the 
first three families, and so on cumulatively. Moreover, when our list of such 
families was finite and determinate, we would also have a recognisable level 
of maximum weight and could tell how far short of it a particular argument 
fell. But how are predicate-families to be ordered appropriately? 

We have to bear in mind that the primary purpose of the weight-judging 
enterprise is to grade our entitlement to detach a value n for the probability 
of H, when given premisses E and p(H/E) = n. And we have to bear in mind 
also that the time or other resources available for enquiry may be limited and 
that it is desirable for the probability of H that we detach to be as close as is 
possible within those limits to H’s true probability. Indeed, because it may 
be the case that very few individuals in the population available for sampling 
possess certain complex combinations of characteristics, it may be im- 
practicable to determine a probability that has more than a certain level of 
weight. For reasons such as these it will be important to give priority in the 
ordering of predicate-families to those families that contain at least one 
predicate which is highly relevant in relation to the accepted prior 
probability of at least one conclusion in the given field. This is where the 
intuition about extent of relevance that was discounted in §7 comes into its 
own. The predicate-family containing a predicate of greater relevance to 
some conclusion in the given field than any other predicate-family contains 
should be placed first in the ordering, and so on down. But, of course, two or 
more predicate-families might tie at any stage, and if so we should either 
have to resort to some arbitrary criterion of priority in their case—say, an 
alphabetical one—or better, if we want to avoid any element of arbitrariness 
in constructing our system of ranking, we should combine such a set of two 
or more tying predicate-families into a single predicate-family that contains 
every possible combination of the predicates belonging to the tying 
predicate-families. With our predicate-families thus ordered we should not 
only be able to rank the weights of any arguments from premisses of the kind 
described in the previous paragraph: we should also be able to ensure that, 
even if the argument was not based on a conjunction of all potentially 
relevant premisses and so its weight was not maximal, then at least it would 
be based on a conjunction of the most relevant premisses so that the 
probability detached was as near to the true one as it could be for that 
number of premisses. 
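
The ordering procedure just described can be sketched as a small routine. Everything numerical here is hypothetical: the integer 'extents of relevance' are invented for illustration, the family and predicate names are examples only, and the convention that an argument skipping an earlier tier receives rank 0 is an assumption of this sketch (the text confines the ranking to cumulatively built premisses).

```python
from itertools import groupby, product

def order_families(families):
    """Order predicate-families by their most relevant predicate, merging ties.

    families maps a family name to {predicate: extent_of_relevance}.
    Returns a list of (names, predicates) tiers, most relevant first. Tying
    families form one tier containing every combination of their predicates.
    """
    def top(name):
        return max(families[name].values())

    ranked = sorted(families, key=lambda name: (-top(name), name))
    tiers = []
    for _, group in groupby(ranked, key=top):
        names = tuple(group)
        combos = list(product(*(sorted(families[n]) for n in names)))
        tiers.append((names, combos))
    return tiers

def weight_rank(tiers, families_in_premisses):
    """Number of leading tiers fully covered by the premisses."""
    rank = 0
    for names, _ in tiers:
        if not set(names) <= set(families_in_premisses):
            break
        rank += 1
    return rank

# Hypothetical extents of relevance (in percentage points) to survival to 65.
families = {
    "hobby": {"hang-gliding": 30, "stamp-collecting": 0},
    "heart": {"weak heart": 30, "sound heart": 2},
    "sex": {"male": 5, "female": 5},
}
tiers = order_families(families)
```

Here 'hobby' and 'heart' tie and are merged into a single first tier of four combined predicates, with 'sex' second. An argument whose premisses cover both merged families has rank 1, and one that also covers 'sex' has the maximum rank of 2.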

Moreover, if on the basis of the same ordered set of predicate-families we 
can rank evidential weight for conclusions containing predicates that belong 
to two or more different families, these weight-rankings are obviously going 
to be comparable with one another. For example, if E adds weight to the 
probability that Mr Smith will survive to 65, given that he’s a lorry-driver, 
then presumably E adds the same weight to the probability that Mr Smith 
will survive to 66 given that he’s a lorry-driver, even though surviving to 65 
and surviving to 66 are not incompatible with one another. 


11 IS IT WORTH WHILE KNOWING THE WEIGHT OF 
AN ARGUMENT WITHOUT KNOWING ITS 
PROBABILITY? 


Keynes defines his concept of weight in such a way that it is possible to know 
the relative weight of an argument without knowing its probability. Yet the 
point of assessing the weight of an argument, as we have seen in §1 above, is to 
be able to grade our entitlement to detach an unconditional probability from 
a conditional one. We need to know the relative value of V(H/E) in order to 
grade our entitlement to infer p(H) = n from the conjunction of E with 
p(H/E) = n. So, associated with the weight of an argument from E to H, 
there is also a value n such that we are directed by p(H/E) = n towards the 
assignment of n to p(H) when E is given. The weight of the argument grades 
our entitlement to proceed in that direction. Hence, though the size of the 
weight and the value of the probability are independent of one another, we 
need to know the direction towards which the argument is headed if we are 
to be able to use our knowledge of its weight. 


12 WHAT IS THE CONNECTION BETWEEN KEYNESIAN 
WEIGHT AND BACONIAN LEGISIMILITUDE? 


Elsewhere, in a development of Francis Bacon’s seminal ideas about 
inductive reasoning (Cohen [1970] and [1977]) I have argued for a method of 
ranking the reliability of any generalised conditional (or of any of its 
substitution-instances) within a particular field of factual enquiry by 
reference to the complexity of the controlled experiments that it survives. I 
call this ‘the method of relevant variables’; and I call the parameter that it 
grades ‘legisimilitude’ (Cohen [1980a] and [1985]) i.e. proximity to being a 
natural law. Experiments of different degrees of complexity are (if properly 
insulated from external factors) like simulations of possible worlds that 
differ from one another in the variety of combinations of inductively 
relevant circumstances that they contain, and generalisations are shown to 
possess greater and greater legisimilitude as they are shown to hold good 
over varieties of possible worlds that are more and more richly stocked with 
combinations of inductively relevant circumstances. One can show too that 
these rankings of legisimilitude constrain one another in accordance with 
the principles of a modal logic that generalise on the Lewis-Barcan system 
S4.¹ It follows that universally quantified conditionals (or their


1 There are isomorphisms also with the Levi-Shackle theory of potential surprise: see (Cohen 
[1980b], pp. 64-66 and p. 171). 
substitution-instances) which are qualified in such a way as to withstand
falsification by any experiment of a certain level of complexity may be 
attributed an appropriate level of legisimilitude by the same system of 
ranking. 

The various lines of reasoning that I have developed in the past in order to
justify this system of ranking Baconian legisimilitude and the lines of 
reasoning developed in 3—9 above in regard to the comparing or ranking of 
Keynesian weight are quite independent of one another. The Baconian 
scheme was defended in a context of concern with generalisations that are 
rooted in causality and it treats of generalisations about probabilities only as 
a special case. The Keynesian scheme has been defended in a context of 
concern with probabilistic relevance and only a limiting case of it (where the 
probability involved is 1) instantiates deterministic generalisation. But both 
lines of reasoning converge on precisely the same underlying structure. The 
method of ranking weight that was defended in g—x0 is an application of the 
method of relevant variables, and the various logical constraints on the 
comparison or assignment of weight that were defended in 3-6 are all 
derivable within the logical syntax of legisimilitude.¹ More exactly, on a
proper reckoning the Keynesian weight of p(H/E), where p(H/E) = n,
should turn out equal to the Baconian legisimilitude of E > p(H) = n. 

The importance of this fact seems to me to be that, even if your intuitions 
about causality are insufficiently powerful to drive you down the Baconian 
road, you may well find that your intuitions about probability will drive you
down the Keynesian road to the same destination as that to which the 
Baconian road would have led you. Moreover it should be clearer now how 
Baconian (i.e. weight-orientated) modes of reasoning are not intrinsically in 
any kind of conflict with probabilistic ones but can serve to complement 
them. 

Finally it is worth considering Keynes’s theory of weight in relation to his 
proposal of a probabilistic mode of assessment for Baconian induction, i.e. 
for inductive support that depends on the variety of relevant evidence. 
When Keynes talked about such induction in Part III of his [1921] he had in 
mind the supporting of deterministic generalisations like ‘All A are B’, 
which he calls ‘universal inductions’, rather than the supporting of 
probabilistic ones like ‘All A have an n per cent probability of being B’, 
which he calls ‘inductive correlations’ or ‘statistical inductions’. If he had 
thought more about the latter in connection with his theory of weight, he 
could have used it to provide an appropriate mode of assessment for their 
support and he might then have sought to extend the theory of weight so as 
to cover deterministic generalisations also and produce a general theory of 
Baconian legisimilitude. But in fact when he briefly ([1921], pp. 409-12) has 


1 The development of this logical syntax was begun in Cohen ([1970], pp. 216-37) and 
continued in Cohen ([1977], pp. 229-40): compare theorem-schema 710 there with 3 here, 
248 with 4, 357 and 613 with 5, and 703, 704, 707 and 728 with 6.
regard to considerations of weight in connection with statistical inductions
he does not use the term ‘weight’ at all or refer back to his earlier discussion
of the subject.


The Queen’s College, Oxford University 
Mathematical Alchemy*


by PENELOPE MADDY 


1 Kripke on the Private Language Argument
2 Wittgenstein on Mathematics 
3 Math, Science and Sensation Talk 


In the PI, Wittgenstein emphasizes that 


An investigation is possible in connexion with mathematics which is entirely 
analogous to our investigation of psychology. (PI, p. 232) 


Of course the famous private language argument is the centre piece of that 
‘investigation of psychology’. On Kripke’s reading,¹ we are to think of the
rule-following argument as the source of both the private language 
argument and Wittgenstein’s account of mathematics in the RFM. In crude 
outline, the rule-following argument undermines all traditional semantic 
theories, that is, those according to which what we learn when we learn a 
linguistic expression predetermines (at least to a large extent) what is to 
count in the future as correct and incorrect usage. This generates a ‘sceptical 
paradox’ to the effect that we can’t mean anything by what we say. 
Wittgenstein’s solution to the paradox, according to Kripke, is his non- 
traditional, criteriological approach to linguistic meaning. Finally, it is this 
criteriological analysis that shows private language to be impossible. 
Combining this Kripkean diagram of the private language argument with 
Wittgenstein’s own insistence that mathematics be treated parallel to 
psychology, we should expect Wittgenstein’s late views on mathematics to 
be generated by the application of criteriological considerations to the 
special case of mathematics. 

Against this background, I find it alarming to note that Wittgenstein held 
mathematical realism—the view that mathematics is the science of a mind- 
independent reality—to be no better than alchemy: 


Received 18 February 1985 


* The account of Wittgenstein’s views offered here is developed from Kripke’s in [1982], so 
perhaps I should follow Putnam in referring to the philosopher under discussion here as 
Kripkenstein. I am grateful to the AAUW, the University of Notre Dame, and the University 
of Illinois at Chicago for partial support of this research, and to John Burgess, Hartry Field,
Dorothy Grover, Phillip Kitcher, Carolyn McMullen, Michael Resnik, Donna Summerfield,
and Mark Wilson for their helpful comments on earlier drafts. References to Wittgensteinian
writings in the text use initials rather than dates; parenthetical numbers refer to paragraphs
within sections.

1 Kripke [1982]. 
The comparison with alchemy seems natural. We might speak of a kind of alchemy in
mathematics. Is it already mathematical alchemy that mathematical propositions are 
regarded as statements about mathematical objects,—and so mathematics as the 
exploration of these objects? (RFM V 16) 


His later writings are littered with remarks of this stripe. Consider: 


When we talk about the sense of mathematical propositions, or what they are about, 
we are using a false picture ... mathematics . . . isn’t about anything. (PG p. 260) 


Arithmetic as the natural history (mineralogy) of numbers. But who talks like this 
about it? Our whole thinking is penetrated with this idea. (RFM IV 11) 


One would like to say of [a mathematical proposition] e.g.: it introduces us to the 
mysteries of the mathematical world. This is the aspect against which I want to give a 
warning. When it looks as if... , we should look out. (RFM II 40-41) 


... even if the proved mathematical proposition seems to point to a reality outside 
itself, still it is only the expression of the acceptance of a new measure (of reality).
(RFM III 27) 


There can be no doubt that Wittgenstein thought all forms of mathematical 
realism to be fundamentally misguided. Now if these anti-realist conclu- 
sions for mathematics follow as directly from the rule-following argument 
and criteriological considerations as does the impossibility of a private 
language, it would be as foolhardy for contemporary philosophers of
mathematics to ignore them as for contemporary philosophers of mind to 
ignore the private language argument. As mathematical realism is enjoying 
something of a renaissance these days,¹ an assessment of the structure of
Wittgenstein’s objections to it is imperative. 

I will approach this question in three steps. First, I will sketch, for the 
record, Kripke’s version of the private language argument. Then, in section 
II, the longest section, I will trace the development of Wittgenstein’s late 
view of mathematics. Finally, in the concluding section, I will argue that 
Wittgenstein’s anti-realist conclusions for mathematics do not in fact follow 
(as the private language argument may) from rule-following and criterio- 
logical considerations alone. 


1 KRIPKE ON THE PRIVATE LANGUAGE ARGUMENT


According to Kripke, the sceptical conclusion of the rule-following 
argument is contained in PI 201: 


This was our paradox: no course of action could be determined by a rule, because 
every course of action can be made out to accord with the rule. The answer was: if 
everything can be made out to accord with the rule, then it can also be made out to 
conflict with it. And so there would be neither accord nor conflict here. 


1 See, for example, Steiner [1975], Resnik [1981-82], Kitcher [1983], and my [1980] and 
[1981]. (My own view is loosely based on Godel [1944] and [1947/64].) 




I won’t summarize Kripke’s account of the argument for this conclusion 
because it is well stated in his book. What follows here will be a sketch of 
(what Kripke sees as) its consequences for traditional semantics and the 
criteriological alternative. 

At stake here is the relation between what we learn when we learn a word 
and our subsequent use of that word. There have been various accounts of 
this relation. For example, a Fregean might say that when we learn a word, 
we grasp its sense, that the sense determines the reference, and that we then 
use the word to refer to its referent. A verificationist might say that we learn 
the word’s verification conditions, and that we then use the word when those 
conditions obtain. Wittgenstein himself, in his BIBR stage, is inclined to say 
that when we learn a word, we grasp something or other which determines 
our future usage. In any case, all three of these accounts can be described as 
asserting that when we learn a word we learn some finite rule which 
predetermines our future usage. But the rule-following argument purports 
to show that rules do no such thing: 


If it is true that you understand a symbol now, and that this means you can apply it 
properly—then, one is inclined to say, you must have the whole application in your 
mind. It may be all in your mind: for example, a complete diagram, or a page with 
rules... . But suppose we had the pages of rules in our minds, would this guarantee 
that we both applied them alike? You may say, “No, he may apply them differently”. 
Whatever goes on in his mind at a particular moment does not guarantee that he will 
apply the word in a certain way in three minutes’ time. Should we then say that a man 
can never know whether he understands a word? If we say this, where shall we stop? 
We can’t even say, “We will know it as time goes on.” Suppose there were six uses of 
the word “house”, and I used it correctly in each of the six ways; is it clear I will use it
correctly the next time? (LFM, p. 23. See also pp. 25, 28.) 


Thus the rule-following argument, if correct, is ruinous not only for 
Fregean or Tractatian realistic semantics, but also for verificationist 
semantics, and ultimately, for any theory that involves our learning 
meanings which arbitrate correct and incorrect future usage.² Any such
meanings are generalized rules, and every application of a rule requires a 
fresh decision. Thus the rule-following argument shows that there are no 
meanings which determine the correctness or incorrectness of our usage, yet 
this seems to yield the unpalatable conclusion that all language is meaning- 
less. For Wittgenstein, this paradox was the impetus for a total rejection of 
traditional theories of meaning and the substitution of a new approach to
language.

Kripke’s baldest statement of this new approach is this:


1 I sketched a version of the rule-following argument in my [1984]. Though I accept the
soundness of the rule-following argument for the sake of argument in the text, the paper cited 
contains an attempt at a rebuttal. 

2 This last is sometimes misunderstood. See Hacker [1972], p. 104, and the references cited 
there. As a matter of terminology, semantic theories according to which something or other 
predetermines the correctness or incorrectness of future uses I will call ‘traditional’. 
Wittgenstein replaces the question, “What must be the case for this sentence to be
true?” by two others: first, “Under what conditions may this form of words be
appropriately asserted (or denied)?”; second, given an answer to the first question,
“What is the role, and the utility, in our lives of our practice of asserting (or denying)
the form of words under these conditions?”¹


This oversimplified picture ignores several points that Kripke brings out. 
First, the term ‘assertability conditions’ implies too great an emphasis on 
declarative sentences; a more Wittgensteinian idiom would be ‘conditions 
under which a move is appropriate in the given language game’. Second, the 
appeal to assertability conditions should not be taken for yet another 
traditional semantic theory along contemporary anti-realist lines. By now 
the rule-following argument has dashed any lingering hope that correct and
incorrect future usage can be legislated by ‘meanings’ of any sort. Finally, 
the emphasis on an expression’s role (as well as the rejection of traditional 
aims) separates Wittgenstein’s new approach from various versions of 
verificationism. 

To see what all this comes to, let’s follow Wittgenstein’s advice in a 
particular case. Suppose I point to the table in the corner of the room and say 
‘That is a table’. Is this expression meaningful? Presumably the answer is 
‘yes’, but how does this follow from Wittgenstein’s account? 

First, we must inquire into assertability conditions. Under what
conditions am I justified in asserting ‘That is a table’? Assuming the conclusion
of the rule-following argument, the answer to this question cannot be 
anything along the lines of my having previously adopted the rule that the 
word ‘table’ applies only to tables. Any use of the word ‘table’ could be made 
out to be in accordance with, or to violate, this rule. Under these conditions, 
my only justification for using the word as I do is that my linguistic training 
has disposed me to do so; I feel inclined to use the word this way. 

This answer will hardly do. If it were correct, there would be no basis for 
your insisting that I am wrong when I point to my cat and say ‘That is a
table’. When we look beyond this suggested assertability condition to the 
wider role of the expression in question, we see that the language game in 
which our talk of tables is embedded does allow for situations in which you 
would be justified in calling me wrong. So we must be mistaken about the 
assertability condition. 

Let’s try again. Apparently, you and I both have our primitive
inclinations to use the expression ‘That is a table’ in various situations and to
withhold it in others. If we disagree, there seems to be no basis for 
determining which of us is correct. At this point, we must appeal to the 
linguistic community of which you and I are members. If I agree with most 
of them, then we are correct in saying that you are wrong. If the pattern is 
reversed, I am wrong (see PI 206). Thus there are two sets of assertability
conditions: our primitive inclinations, and the standard inclinations of most


1 See Kripke [1982], p. 73.
members of the community. If these didn’t agree most of the time, our
language would collapse (PI 240). 

In the case under consideration, the standard inclination of the
community is to say ‘That is a table’ only in the presence of tables. We judge the
correctness or incorrectness of someone’s use of the expression against this 
public standard. If there were no such public standard, there would be no 
difference between my thinking I was using the expression correctly (my 
primitive inclination), and my actually using the expression correctly (see 
PI 202). In Wittgensteinian terminology, such a public standard is called a 
‘criterion’. 

So this is the criteriological approach; let’s now see how it applies to the 
case of private language. The fulcrum of the argument is the specification of 
what is to count as private. Wittgenstein sets to work on this problem at the 
outset of his discussion: 


But could we also imagine a language in which a person could write down or give 
vocal expression to his inner experiences—his feelings, moods, and the rest—for his 
private use?— Well, can’t we do so in our ordinary language?—But that is not what I 
mean. The individual words of this language are to refer to what can only be known 
to the person speaking; to his immediate private sensations. So another person 
cannot understand the language. (PI 243) 


This means that the sensation words of this language are not connected
with any characteristic expressions of sensation; if they were, someone else 
could understand them. For the language to be truly private, we must 


... suppose I didn’t have my natural expression of sensation, but only had the
sensation. ... And now I simply associate names with sensations and use these names 
in descriptions. (PI 256) 


This association is to be set up by some form of private ostensive definition,
a concentration of attention on the sensation (PI 258). 

On the criteriological account of language, it is not hard to see why a 
private language in this sense is impossible. This turning of attention on the 
sensation is supposed to predetermine correct and incorrect uses of the 
newly introduced name. But just as in our first, now-discarded attempt to 
specify assertability conditions for ‘table’, we find that this purported 
definition provides no distinction between correct uses of the term and uses 
which seem correct to me: 


... whatever is going to seem right to me is right. And that only means that here we
can’t talk about ‘right’. (PI 258; see also 259 and 269) 


In other words, no amount of inward pointing can provide public criteria for 
correct use of the newly introduced name. A term without criteria is 
meaningless, so there can be no private language. 

This cannot mean that all our talk of sensations is meaningless; in this 
context, Wittgenstein will not fly so blatantly in the face of common sense. 
Instead, his conclusion is that our ordinary language of sensation is not
private: 


How do words refer to sensations? There doesn’t seem to be any problem here; 
don’t we talk about sensations every day, and give them names? But how is the 
connection between the name and the thing named set up? .. . words are connected 
with the primitive, the natural, expressions of the sensation and used in their place.
(PI 244)


Various characteristic behaviour patterns (wincing, groaning, etc.) serve as 
criteria for our use of the word ‘pain’, for example. We count a person as 
understanding the word ‘pain’ if he uses it on occasions when he is naturally 
inclined to cry. 

This account of sensation talk needs one final bit of fine-tuning. It sounds 
as if the sensation word ‘means’ or ‘stands for’ the characteristic behaviour, 
which leads to the behaviouristic notion that mental terms really refer to 
complex behaviour patterns. Wittgenstein’s opponent raises the standard 
objection: 


“But you surely will admit that there is a difference between pain-behavior
accompanied by pain and pain-behavior without any pain?”—Admit it? What
greater difference could there be?—“And yet you again and again reach the
conclusion that the sensation itself is a nothing.” (PI 304; see also 307)


The mistake here, according to Wittgenstein, is in supposing that all 
language serves the same purpose: 


The paradox disappears only if we make a radical break with the idea that language 
always functions in one way, always serves the same purpose: to convey thoughts— 
which may be about houses, pains, good and evil, or anything else you please. (PI 304)


Many expressions, like ‘table’, do function descriptively; their referents can 
be picked out by pointing (PI 669); their criteria are very close to what a 
Tarskian semanticist would think of as truth conditions.¹ The trouble arises
when we see the close connection of sensation talk with characteristic 
behaviour, assume that all language serves the purpose of description, and 
conclude that sensation talk describes behaviour. In fact, 


. . . the verbal expression of pain replaces crying and doesn’t describe it. (PI 244) 
(See also PI 290, 317.) 


2 WITTGENSTEIN ON MATHEMATICS 


Given this Kripkean account of how the private language argument follows 
from the rule-following argument, I now turn to an exposition of 
Wittgenstein’s late philosophy of mathematics. The question of the extent 


1 See Kripke [1982], pp. 99, 105. 
to which these views can be defended along lines parallel to the attack on
private language will be postponed until the next section. 

Wittgenstein’s views on mathematics after the acceptance of the
rule-following argument and the development of the criteriological approach are
of central interest here. Oddly enough, though, many of these late 
Wittgensteinian views are already present during the transitional period of 
the PR (1929-1930) and the PG (1932—1934) which predate the rejection of 
traditional semantics. To some extent, the positions on mathematics remain
constant while the grounds on which they are based shift. I hope that tracing 
this development will shed some welcome light on the RFM (1937-1944). 


Mathematics in the Philosophical Remarks 


The most developed view of mathematics in the PR generalizes the 
verificationism of that book to cover the case of mathematics: the sense of a 
mathematical proposition is given by its proof (PR 122(4), 154(4), 189(3)).¹

This verificationist position confronts a number of serious obstacles. 
Their source can be understood by means of a comparison. Any view of 
mathematics that stresses proof over truth conditions calls intuitionism to 
mind, but philosophical intuitionism has taken several forms. Many 
contemporary philosophers working on intuitionism prefer a semantic 
reading: the classical mathematician’s Tarskian semantics for mathematical 
statements is replaced by some form of verificationism.² Most commonly,
the classical mathematician’s statement is taken to mean that there is a 
certain construction rather than that various mathematical entities have 
certain properties. 

Wittgenstein’s verificationism is more radical: the sense of a statement is
what would verify it, not the claim that such a verification exists. In the case 
of a phenomenal proposition, the sense is as it were a description of a certain 
experience; to judge the truth or falsity of the proposition, we compare its 
sense with reality to see if that possible experience occurs. In the 
mathematical case, the sense of a statement is a description of its proof, not 
the assertion that it has a proof or that a proof has been carried out (PR 166). 
Here’s the rub. If a mathematical statement has sense, there is a complete 
description of its proof. But a complete description of a proof is a proof, so no 
further comparison with reality is needed to assess the statement’s truth 
value. More to the point, it can have only one truth value: true. The problem 


1 Two other views are considered in the PR. First, there is the idea that mathematical 
statements try to express what is necessary, hence inexpressible (PR 116(6), 119(3-5), 
200(5)). This is a hold-over from the Tractatus, and vanishes with the rejection of necessity in 
the PG. Second, Wittgenstein suggests that mathematical propositions are contingent, 
contentless stipulations (PR 121(1), 160(2), 202(5)) used only in applications to real 
propositions (PR 107(1), 114(6)). This way of thinking is much more at home in the context of 
the PG with its arbitrary grammar. See below. 

2 See Dummett [1973]. 
is the old one the picture theory was designed to solve: how can a proposition
be both meaningful and false? 

This too close connection between meaning and truth value has some 
striking consequences (PR 148-151). Strictly speaking, ‘false’ statements 
like 1 = 0, unproven statements like Fermat’s Last Theorem, and all
mathematical conjectures are meaningless (PR 148). Indeed, a mathema- 
tician shouldn’t be able to set out to prove an unproved proposition because 
it would be nonsense to begin with. Furthermore, even if he does succeed in 
‘proving’ it, he won’t have proved what he set out to prove (that is senseless), 
but a new proposition with a new sense (his proof). (See PR 155(4-6).) 

Wittgenstein is sensitive to the counterintuitiveness of these con- 
sequences; he writes, ‘my explanation mustn’t wipe out the existence of 
mathematical problems’ (PR 148(3)). He makes some brave efforts to head 
off the problem of meaning and truth value by loosening the requirements 
on sensicality (PR 148-151). At various points, he introduces the notions of 
a proof of relevance (‘. . . a proof which, while yet not proving the 
proposition would show the form of a method to be followed in order to test 
such a proposition’ (PR 149(2))), a general method of solution (PR 117(2, 3), 
149(6, 7), 150(3), 174(5)), and a systematic search (PR 150(15, 16)).¹ But
even these weakened demands are too strong: 


Wouldn’t all this lead to the paradox that there are no difficult problems in 
mathematics, since, if anything is difficult, it isn’t a problem? (PR 151(11)) 


To combat this, Wittgenstein tries to loosen the requirements on sensicality 
even further. In difficult cases, he suggests, though the mathematician lacks 
a written symbolism, he enjoys ‘some sort of psychic symbolism, in images, 
“in his head” ’ (PR 151(12)). 

The only sort of mathematical problem still without sense on this account 
is one which requires a simple, unsystematic search for a solution. 
Wittgenstein disallows this on the grounds that it would be synthetic and 
any verification procedure in mathematics must be analytic (PR 151(12)). 
Thus, even if we are somewhat unreasonably generous about the unclear 
and ad hoc loosenings of Wittgenstein’s verificationist requirements, the 
more difficult open problems—those for which no general method of 
solution is either known or ‘present psychically’— will remain senseless. We 
can only conclude that Wittgenstein’s peculiar verificationist semantics for 
mathematics is seriously flawed by its inability to distinguish sufficiently 
between sense and truth value. 

Two other aspects of Wittgenstein’s verificationist position in the PR are 
important for understanding the RFM: the relationship between mathe- 
matics and natural science, and the treatment of infinity. On the first of 


1 Another possible move, based on embryonic rule-following considerations, is mentioned 
briefly in the PR (see 149(1)), but not followed up. I’ll mention it below in the context of the 
later work. 
these, recall that meaningful mathematical and phenomenal statements are
seen as parallel: a calculation or proof¹ verifies the one just as an experience
verifies the other. In contrast, scientific statements, even simple physical 
object statements, aren’t propositions at all, but hypotheses, adopted for 
simplicity, that can’t be conclusively verified (PR 225-230). This sounds at 
first like the commonplace that we can never be certain of the truth of a 
physical object statement, but in the context of verificationist semantics, the 
open-endedness of the potential evidence means that a physical object 
statement lacks determinate sense, and hence, lacks truth value (PR 228(2)). 

Still, Wittgenstein insists that such statements bear an important formal 
relation to reality because experience can confirm or disconfirm them (PR 
225(3), 227(3)). Why shouldn’t the situation be analogous for mathematical 
statements that can’t be directly verified by calculation or conclusive proof? 
For example, why shouldn’t we say that the Fundamental Theorem of 
Calculus was confirmed by various geometric plausibility arguments before 
it was finally proved from the newly developed theories of limits and real 
numbers? Or that the determinacy of Borel sets was confirmed by its proof 
from the axiom of measurable cardinals, then further confirmed by its later 
proof from the axioms of Zermelo-Fraenkel alone?²

Wittgenstein rejects this idea because it runs counter to an important 
Wittgensteinian thesis: 


We divide the evidence for the occurrence of a physical event into the various kinds 
of such evidence, into heard, seen, measured, etc .. . there can’t be two independent 
proofs of one mathematical proposition. (PR 119(1))³


And why can’t there be two independent proofs of a single mathematical 
proposition? Because the confirmation of a physical object statement 


... requires grounds from the outside, whereas a mathematical proof is an analysis of 
the mathematical proposition. (PR 153(1); see also 162(6)) 


So the strong disanalogy between mathematics and natural science rests on a 
new counterintuitive consequence of Wittgenstein’s views: no mathematical 
proposition can have two distinct proofs. And this, like the meaninglessness 
of unproved mathematical statements, is partly supported by the purported 
analyticity of mathematics. 

On the important issue of infinity in mathematics, Wittgenstein’s position 
in the PR is uncompromising: nothing can be said about all members of an 
infinite collection; indeed, there is no such thing as all members of such a 
collection (PR 126(1, 2)). Two arguments are given for this claim. The first 
rests directly on verificationism: a proposition purportedly about an infinite 
collection cannot be verified, so it cannot have sense (PR 123). Of course, 
scientific hypotheses can have an infinite range of partial verifications, but 
this option is closed to mathematical statements by the analyticity-based 
argument of preceding paragraphs. 


1 It is clear that Wittgenstein considers calculations and proofs together in their verifying roles; 
he discusses now one, now the other, without any indication that significant differences exist 
between them. I will follow him in this. For stylistic purposes, I might omit mention of one or 
the other in particular contexts, but nothing substantive should be deduced from this. 

2 See Martin [1970] and [1975]. 

3 See also PR 155(8), and Moore [1959], pp. 305-306. 

The second argument against infinity rests on the idea that a mathematical 
statement must have the same complexity as what it describes (PR 118(4), 
166(4)). Thus a proposition about infinity would have to be infinitely 
complex, which is impossible (PR 127). As far as I can tell, the only non- 
question-begging supports for this equicomplexity claim depend on ele- 
ments of the defunct picture theory to which Wittgenstein is no longer 
entitled, or upon odd assumptions about the natural numbers. Let me 
sketch the latter. 

Though we can’t symbolize infinity, Wittgenstein does think that we can 
imagine infinite sequences; what he denies is that these are composed of 
simple objects: 


... that which we can imagine multiplied to infinity [is] never the things themselves, 
but combinations of things in accordance with their infinite possibilities. (PR 147(8)) 


(See also PR 107(4-8).) What is infinite about the natural numbers is the 
rules for the generation of numerals. The actual numerals are limited by the 
available writing materials and so on, but these limitations are not present in 
the rules (PR 141(1, 2)). 

One might, then, try to say something about all natural numbers by means 
of these rules, but this, according to Wittgenstein, is futile. Each application 
of the successor function, that is, each natural number, has its own 
‘irreducible individuality’ (PR 125(3, 6)): 


... the properties of a particular number cannot be foreseen. You can only see them 
when you’ve got there. (PR 125(4)) 


Thus any successful attempt to describe the infinite sequence of all natural 
numbers by a law would have to involve an infinitely complicated law, which 
is to say, ‘no law at all’ (PR 125(8)). There is no way to say something about a 
particular natural number without bringing in that number itself (PR 
125(3)), and no way to speak of all natural numbers without bringing in the 
entire ‘collection of structures’ (PR 126(2)) constituted by the repetition of 
successor. In short, to speak of infinity would require infinitely complex 
notation. ‘Form cannot be described: it can only be presented’ (PR 171(11)). 

One consequence of this view of infinity is that all of modern infinitary set 
theory is illegitimate. Infinity is in the unrestricted rules of the symbolism. 
Some of these unrestricted symbolic possibilities are actual, some only 
possible (because of material limitations), but mathematics doesn’t dis- 
tinguish between these (PR 144(5, 6)). Set theory, however, tries to discuss 
all these possibilities explicitly as if they were actual (PR 144(7)). This error 
shows itself in the set theorist’s assumption that infinite sets can be 
described in finite notation, rather than represented (as is impossible) in 
infinite notation. In Wittgenstein’s words, 


The theory of aggregates attempts to grasp the infinite at a more general level than a 
theory of rules. It says that you can’t grasp the actual infinite by means of 
arithmetical symbolism at all and that therefore it can only be described and not 
represented. The description would encompass it in something like the way in which 
you carry a number of things that you can’t hold in your hands by packing them in a 
box. They are then invisible but we still know we are carrying them (so to speak, 
indirectly). The theory of aggregates buys a pig in a poke. Let the infinite 
accommodate itself in this box as best it can. (PR 170(1)) 

Set theory is wrong because it apparently presupposes a symbolism which doesn’t 
exist instead of one that does exist (is alone possible). It builds on a fictitious 
symbolism, therefore on nonsense. (PR 174(1)) 


Because set theory is based on this fundamental confusion, Wittgenstein 
suggests that ‘we ought to cut it down to size’ (PR 144(7)). It seems clear that 
very little would be left after such an operation. The most basic techniques 
of comparing cardinal numbers are rejected in the PR: the one-to-one 
correlation of the natural numbers with the even natural numbers to show 
the infinity of the set of natural numbers and its equinumerosity with the 
evens (PR 141(3-5)), and the very concept of non-denumerability (PR 
174(3)). Needless to say, a blanket prohibition on set theory and its methods 
would severely alter the face of modern mathematics.¹ 
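
The correlation rejected here is the familiar rule that pairs each natural number n with the even number 2n. As a rough illustration only (a Python sketch that is not drawn from the PR; the function names are hypothetical), note that any run of such a rule exhibits just a finite initial segment of the pairing, while the unboundedness lies in the rule itself:

```python
from itertools import count, islice


def pair_with_even(n):
    """The correlating rule: send the natural number n to the even number 2n."""
    return 2 * n


def initial_pairs(k):
    """Return the first k pairs (n, 2n), starting from n = 0.

    Only a finite initial segment can ever be written out; the unbounded
    part lives in the rule, not in any list we actually produce.
    """
    return [(n, pair_with_even(n)) for n in islice(count(0), k)]


if __name__ == "__main__":
    # Prints [(0, 0), (1, 2), (2, 4), (3, 6), (4, 8)]
    print(initial_pairs(5))
```

The set theorist reads the rule as establishing a completed one-to-one correspondence between two infinite totalities; on the PR view described above, only the rule and its finite applications are ever actually given.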


Mathematics in the Philosophical Grammar 


The simple verificationism of the PR is rejected immediately in the PG on 
the grounds that it ties sensicality too closely to truth (PG, p. 366(1)), and 
Wittgenstein now speaks of proofs of relevance (PG, pp. 299-302), general 
methods (PG, p. 349(3, 4)), and methods of checking (PG, pp. 366(1, 3), 
379(5), 392(1)). The first two are as unsatisfactorily vague as their 
counterparts in the PR, but the third and most prominent is developed on a 
close analogy with arithmetical calculation. An unproved statement will 
have sense, then, if there is a standard method of checking for its truth. This 
eventually leads Wittgenstein to the startling conclusion that ‘Mathematics 
consists entirely of calculations’ (PG, p. 468(4)). But even this bold 
expedient doesn’t seem to forestall the unwelcome consequences noted in 
the PR: the proposition proved is no longer the proposition attempted (PG, 
pp. 371(1), 374(6), 378(4)); mathematical conjectures are not problems in any 
usual sense (PG, p. 380(1, 2)); no mathematical proposition can have more 
than one proof (PG, pp. 360(2, 3), 373(4), 375(1)). The arguments here are 
the same as in the PR, depending both on (loosened) verificationism and the 
analyticity of mathematical statements. 


1 The PR ban on completed infinities also leads Wittgenstein to identify real numbers with laws 
rather than extensions (see PR, chapters XVII-XVIII). This produces a number of quite 
intuitionistic-sounding pronouncements, e.g. ‘Is it possible to prove a greater than b without 
being able to prove at which place the difference will come to light? I think not.’ (195(1)). 
(See also RFM V 9-13, 17-23, 25-26; VII 41.) I find it hard to avoid the conclusion that 
Wittgenstein’s position at this time entailed some intuitionist-style reform of analysis (see 
also PG, pp. 471-485), but I'll stick to set theory where he is more explicit. 

The position on infinity also carries over from the PR. Talk of infinity is 
rejected because there is no infinite checking procedure (PG, p. 451(3), 473(4)), 
and because it would require infinite notation (PG, p. 432(1), 468(1, 2)). 
And, also as before, set theory is rejected (PG, p. 468(6), 469(3, 4), 
286(3), p. 287(1)). Here Wittgenstein explicitly recommends that we ‘prune 
mathematics’, assuring us that 


Philosophical clarity will have the same effect on the growth of mathematics as 
sunlight has on the growth of potato shoots. (In a dark cellar they grow yards long.) 
(PG, p. 381(5, 6)) 


In sum, then, the details of the verificationist position in the PG have 
changed little since the PR except in the emphasis on methods of checking 
and the increased tendency to identify all mathematics with calculation. 

As mentioned earlier (see note 7), the PR contains another view of 
mathematics: 


... equations are rules of syntax ... [this] makes intelligible the attempts of the 
formalists to see mathematics as a game with signs. (PR 121(1, 2)) 


Even though syntax in the PR reflects necessary truth (PR 54(8)), 
Wittgenstein seems to have a version of conventionalism in mind here: 


A mathematical proposition can only be either a stipulation or a result worked out 
from stipulations in accordance with a definite method. (PR 202(5)) 


In the PG, where all grammar is arbitrary (PG 138(1)), X), the formalistic 
picture of mathematics as a conventional game with symbols analogous to 
the game of chess with chess pieces is given a fuller statement in a more 
congenial context (PG, pp. 289-295). Here winning and losing in chess set 
up an arbitrary polarity analogous to truth and falsity in arithmetic (PG, pp. 
293(3, 6), 294(4, 5)). 

A standard objection to game formalism is that it leaves mathematical 
statements senseless. Wittgenstein ponders the significance of this denial of 
sense. If it is only meant to distinguish mathematical propositions from 
empirical ones, he is in favour (PG, p. 289(2, 7)), but his support runs deeper 
than that. Recall Wittgenstein’s insistence that mathematical statements 
must have the same multiplicity as what they represent. For him, the infinity 
in arithmetic is in the infinite possibilities of the symbolism; the two cannot 
be divorced. Thus it should come as no surprise to find him criticizing the 
idea of mathematical sense on the grounds that it diminishes the importance 
of the particular mathematical symbolism (PG, p. 290(3)). (If the same sense 
can be expressed in various different ways, the particular symbolism loses its 
importance.) In the final analysis, he takes the debate between senseless and 
contentful mathematics to be a misguided one, like that between idealism 
and realism, soon to be obsolete (PG, p. 293(1)). 

What are we to make of this? In the PR, where sense or grammar reflect 
necessities, a switch from a verificationist theory of sense to a game picture 
with its arbitrary rules and conventional truth is a significant one. But in the 
PG, the contrast is not so stark. Here the sense of a mathematical statement 
on the first view is some loosened version of its verification conditions; these 
conditions, the statement’s grammar, are conventional, as are the determi- 
nations of truth and falsity that depend on them. To this must be added the 
strong dependence of mathematics on its symbolism. So what is the 
difference between this version of verificationism and game formalism? On 
both views, a mathematical statement has a determinate, conventional role; 
the only difference is that the first calls this the statement’s sense, while the 
second insists the statement is senseless. No wonder Wittgenstein finds the 
difference insignificant.¹ 


Mathematics in the Remarks on the Foundations of Mathematics 


Between the PG and the RFM comes the rule-following argument and the 
criteriological alternative to traditional semantics. We’ve seen the con- 
sequences Wittgenstein draws for private language; we turn now to the 
effects of the criteriological approach on the philosophy of mathematics. 
The new position is a more subtle and in some ways more radical version of 
the ideas of the transitional period. 

The first step is to ask after the assertability conditions of mathematical 
propositions. From the perspective of the isolated mathematician, several 
answers can be given. For a simple statement like ‘There are two towers in 
the World Trade Center’, the assertability conditions involve counting or 
the speaker’s inner sense of recognizing the number without counting. 
Assertion of numerical relations like ‘365 × 74 = 27010’ usually requires 
some form of calculation, and advanced mathematical propositions like 
‘There are infinitely many primes’ demand proof. 

A realist with a traditional semantics might claim that these assertability 
conditions are ultimately justified by their connection with the truth 
conditions of these statements. ‘Two’ refers to a certain numerical property, 
and the set that presents itself when our speaker looks at the Trade Center 
has that property. Similarly, ‘365’, ‘74’ and ‘27010’ refer to number 
properties, ‘365 × 74 = 27010’ expresses a relation that holds between them, 
and calculation is a correct technique for producing true statements. 
Finally, proof is a sound procedure for moving from truths to truths. But 
any account along these lines presupposes a determinate reference relation in 
its treatment of truth. As we are assuming the correctness of the rule- 
following argument, moves of this sort are not available. 


1 This muddle between the role of the mathematical proposition giving its sense and the 
proposition having only a role, not sense, carries over into the RFM. As the difference is 
insignificant, I won’t continue trying to keep them straight. 

We are left, as in the previous section, with the seemingly insurmountable 
problem of distinguishing between cases in which our isolated mathema- 
tician has a dependable inner conviction, a correct calculation or a sound 
proof procedure, and those in which he merely thinks he has one of these. 
Or, as we’ve put the matter before, what could justify an independent claim 
that he has gone wrong? Once again, we must fall back onto public standards 
and community agreement. The mathematician’s inner conviction is only 
correct if it agrees with the inner convictions of most of his fellow 
mathematicians: his counting and his calculations are correct if they are like 
ours; his proofs are sound if his colleagues also find them so. In other words, 
the public criterion for his counting correctly is his counting as we do, for his 
calculating correctly is his calculating as we do, for his proving correctly is 
his proving as we do. If he conforms to these norms, he is admitted to the 
mathematical community. (See, for example, RFM I 2, 61, 66, 112, 116; VI 
39, 41.) 

These conclusions produce a novel picture of the nature of mathematical 
practice and the role of proof. The backbone of a proof is its logical 
structure; this structure is supposed to guarantee that if the premises of the 
argument are true, then the conclusion is also. But logical inference, for 
Wittgenstein, is just rule-following behaviour of a special sort; which 
actions accord with the rule is determined by public criteria and community 
agreement, not by some ‘truth’ of the matter: 


. . . the reason [the logical inferences] are not brought in question is not that they 
‘certainly correspond to the truth’—or something of the sort, —no, it is just this that 
is called ‘thinking’, ‘speaking’, ‘inferring’, ‘arguing’. There is not any question at all 
here of some correspondence between what is said and reality; rather is logic 
antecedent to any such correspondence; in the same sense, that is, as that in which the 
establishment of a method of measurement is antecedent to the correctness or 
incorrectness of a statement of length. (RFM I 156) 


As a community, then, we are free to infer in any way we like without coming 
into conflict with any reality. It isn’t that we must infer f(a) from (x)f(x) 
because of the meaning of the latter; rather, we learn the meaning of (x)f(x), 
in part, by learning that f(a) follows from it (RFM I 10). 

What goes for logical inference in general naturally applies to mathema- 
tical proofs in particular. The proof doesn’t force the mathematician to 
accept its conclusion on pain of falsehood. Instead, upon reading the proof, 
the mathematician finds it natural to accept it as a proof, to accept its steps as 
logical ones. (‘. . . in the proof I have won through to a decision.’ RFM III 
27(4).) In doing so, if he is in agreement with his community, the 
mathematician has decided to modify the concepts involved, or better, to 
replace them with new ones. (‘The proof places this decision in a system of 
decisions.’ RFM III 27(5).) Consider, for example, the proof that there is no 
construction for the trisection of an angle: 
In the course of this proof we formed our way of looking at the trisection of the angle, 
which excludes a construction with ruler and compass. (RFM IV 30(7)) 


On another occasion, 


The concept which the proof creates may for example be a new concept of inference, 
a new concept of correct inferring. (RFM 41(9)) 


In general, as with any activity according to a rule, the process of proving 
produces new criteria for correct proving, new criteria for application of the 
concepts involved in the proposition proved. In doing so 


... the proof changes the grammar of our language, changes our concepts. It makes 
new connections, and it creates the concept of these connections. (It does not 
establish that they are there; they do not exist until it makes them.) (RFM III 31) 
(See also RFM III 41 and IV 29-31, 36, 45.) 


To take an example, suppose a student has been taught to recognize 
squares on sight. He is now presented with (and accepts) the proof that a 
diagonal of a square divides it into two congruent isosceles triangles. He is 
then handed what looks like a square piece of paper, one he would have been 
inclined to classify as square with his purely visual concept. But, when he 
folds it along a diagonal and notes that the triangles formed are not 
congruent, he denies that it is square. According to Wittgenstein, his criteria 
for the application of the concept ‘square’ have changed, which is just 
another way of saying that the meaning of the word ‘square’ has changed, 
that his concept of square has changed. Similarly, a new and accepted 
calculation provides a new criterion for correct calculation. It wasn’t correct 
before it was performed, but now its pattern is a new entry among the 
paradigms of language.¹ 

This idea of proof as concept formation has some interesting immediate 
consequences. If a proof is to serve as a model by which future actions are 
measured, it must surely be surveyable (RFM III 43), perspicuous (RFM 
III 1), memorable (RFM 9g). A proof convinces us to accept a new model, 
and this is accomplished not by abstract, ideal strings of symbols, but by 
patterns which can be used as paradigms. From this it follows that 
abbreviations are not ‘mere’: 


. . . if you have a proof-pattern that cannot be taken in, and by a change of notation 
you turn it into one that can, then you are producing a proof, where there was none 
before. (RFM III 2)² 


1 See RFM I 32, 76; III 31; VI 22; LFM, p. 73, and Chihara [1963], especially p. 27. 

2 In his [1970], Essenin-Volpin also holds that abbreviations are not trivial, and that proofs in 
different notations should be distinguished. He comes to these views via a rejection of 
mathematical induction in particular rather than a broader scepticism about rule-following. 


Similarly, the introduction of decimal notation is a non-trivial improvement 
on a stroke notation because we couldn’t really learn to calculate in the stroke 
notation as we do in decimal notation (RFM III 12). Finally, it follows that 
mathematics is not logic, or at least not Russellian logic. In the logicist 
reduction of mathematics to logic, the proof of 532 + 621 = 1153 is supposed 
to be the proof that a certain very long formula is a tautology. The problem is 
that this tautology is simply too long to serve as a paradigm; indeed our very 
conviction that there is such a tautology is based on our acceptance of the 
calculation in ordinary decimal notation, not vice versa: 


If someone tries to show that mathematics is not logic, what is he trying to show? He 
is surely trying to say something like:—If tables, chairs, cupboards, etc. are swathed 
in enough paper, certainly they will all look spherical in the end. He is not trying to 
show that it is impossible that, for every mathematical proof, a Russellian proof can 
be constructed which (somehow) ‘corresponds’ to it, but rather that the acceptance 
of such a correspondence does not lean on logic. (RFM III 53) 


The logical proof is surely not the foundation on which the decimal proof 
rests because in case of any divergence between the two, we would surely 
trust the version in decimal notation (RFM III 17-18, 57). Though 
Wittgenstein rejected the view that mathematics is logic as early as the PR 
(see PR 103-107), the present grounds of this rejection are new, springing as 
they do from general rule-following considerations. 
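
The worry about surveyability can be made concrete with a small sketch. The following Python fragment is purely illustrative: the stroke encoding and the function names are assumptions introduced here, not anything found in the RFM. It carries out 532 + 621 in a stroke notation, where addition is mere concatenation, and compares the result with the decimal calculation:

```python
def to_strokes(n):
    """Write the number n in stroke notation: n copies of '|'."""
    return "|" * n


def add_strokes(a, b):
    """Addition in stroke notation is just writing one string after the other."""
    return a + b


def from_strokes(s):
    """Read a stroke numeral back as a decimal integer by counting strokes."""
    return len(s)


if __name__ == "__main__":
    total = add_strokes(to_strokes(532), to_strokes(621))
    # The stroke result is a row of 1153 marks that no one can take in at a glance.
    print(len(total))   # 1153
    # The decimal result is a four-digit pattern that can serve as a paradigm.
    print(532 + 621)    # 1153
```

Both procedures agree, but checking the stroke result already relies on counting in decimal notation, which is one way of putting the claim that our confidence rests on the decimal calculation and not the other way round.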

Some light can be cast on the subtleties of Wittgenstein’s late position on 
mathematics by comparing it with the verificationism or game formalism of 
the PG. Three fundamental changes stand out. First, the rule-following 
argument has ruined any hope of providing a traditional semantic theory, a 
theory of mathematical senses which determine the correctness or incorrect- 
ness of future uses. This undermines both verificationism, according to 
which some version of verification conditions predetermines what is correct 
and incorrect, and game formalism, where the rules of the game perform this 
function. This problem more or less dictates the second major change, the 
replacement of verification conditions or game rules by public criteria. The 
proof of the proposition no longer determines its meaning in the traditional 
sense, but the ability to recognize proofs in agreement with his fellows is 
what admits a mathematician into the community of mathematicians, that 
is, the community of those who mean something by their use of mathema- 
tical propositions. The third and final difference is that it is not enough for a 
community to utter a proposition and to agree publicly on its correct and 
incorrect uses; the proposition remains without meaning unless it has an 
identifiable role in the life of that community. 

Despite these deep differences, the verificationism of the PG and the 
criteriological approach of the RFM share an emphasis on the role of proof 
in the determination of meaning. It is worth asking to what extent the new 
position avoids the difficulties associated with the old. Recall that the 
problems faced by mathematical verificationism in the PR and the PG 
spring from an overly close connection between the meaning of a mathema- 
tical proposition and its truth. The proposition is to have sense if there is a 
complete description of its proof, but if there is such a description, there is a 
proof, and the proposition is true. This leads to the unwelcome conclusions 
that false propositions and unproved conjectures are meaningless, and that 
one can never prove the proposition one set out to prove. As we’ve seen, 
Wittgenstein took great pains during the transitional years to avoid these 
conclusions, without notable success. 

The rule-following argument helps a bit by loosening the connection 
between a complete description of a proof and the proof itself. (Working out 
the complete proof from the description is a special sort of rule-governed 
behaviour and thus its result is not predetermined by the description alone.) 
This suggests that there might be a complete description of a proposition’s 
proof, and the proposition thus be sensical, without it being yet determined 
whether or not the proposition is true. (This depends on how the 
community carries out the complete proof and how it reacts to that pattern 
when it has it.) But Wittgenstein takes little solace from this small separation 
between meaning and truth. As late as the PI (1945-1949) he writes: 


When longing makes me cry “Oh, if only he would come!” the feeling gives the 
words ‘meaning’... But here one could also say that the feeling gave the words truth. 
And from this you can see how the concepts merge here. (This recalls the question: 
what is the meaning of a mathematical proposition?) (PI 544) 


On the new view, proofs don’t provide standard senses, but they do serve as 
public criteria of both meaningfulness and truth, and the connection 
between these two is still uncomfortably close: for a mathematical pro- 
position to be meaningful, it must have public criteria, and if it has such 
criteria, it has a proof and must be true. 

Predictably, then, the old problems re-arise, if in slightly modified form. 
For the verificationism of the transitional period, the question about 
unproved conjectures or false mathematical propositions is how they can 
have sense when they don’t have proofs. In the climate of the RFM, no 
proposition has a determinate sense in the standard way, but the meaningful 
ones do have public criteria and identifiable roles. Conjectures and 
falsehoods certainly lack public criteria, that is, proofs, and thus it seems 
impossible to learn their meanings, to tell if someone else means something 
by them, in short, to attach any meaning to them at all. The question is 
whether or not we understand them: 

... if I am to know what a proposition like Fermat’s last theorem says, must I not 
know what the criterion is, for the proposition to be true? And I am of course 
acquainted with criteria for the truth of similar propositions, but not with any 
criterion for the truth of this proposition. (RFM VI 13; see also V 42) 

And, as before, this generates the problem of how we could ever prove the 
proposition we set out to prove: 


“I am going to show you how there are infinitely many prime numbers” presupposes 
a condition in which the proposition that there are infinitely many prime numbers 
had no, or only the vaguest, meaning. (RFM VI 14) 


From the point of view of public criteria, the situation here seems no better 
than during the transitional period. 
Still, Wittgenstein makes a gesture towards alleviating these problems in 
the new context. Recall that public criteria aren’t the whole story of 
linguistic meaning on the criteriological approach; expressions must also 
have identifiable applications in the life of the linguistic community. If this 
community role is overlooked, 


... one is looking at language without looking at the language-game. (RFM VII 
10(9)) 


In mathematics, the proof leads us to assert a proposition, and thus provides 
a criterion for its truth, but the proposition’s use is also determined by what 
we then go on to do with the proposition: 


The proof convinces us of something—though what interests us is, not the mental 
state of conviction, but the applications attaching to this conviction. (RFM III 25) 


Without this application, our assertion of the proposition is pointless at best; 
Wittgenstein even doubts that we understand the proved proposition if we 
are unable to apply it (RFM V 7). 

How does this stress on the necessity of application help with the problem 
of unproved and false mathematical propositions? As long as these lack 
proof, it seems they will also lack application, as it is the proof which gives 
the proposition its use (RFM III 29(4)). Surprisingly, Wittgenstein is 
prepared to waive this connection of proof and application in some cases. 
The first indication of this possibility comes toward the beginning of the 
RFM: 


But can’t I believe the geometrical proposition without a proof, for example on 
someone else’s assurance?— And what does the proposition lose in losing its proof? — 
Here I presumably ought to ask: “What can I do with it?” for that is the point. 
Accepting the proposition on someone else’s assurance—how does my doing this 
come out? I may for example use it in further calculating operations, or I may use 
it in judging some physical fact. . . . But I feel a temptation to say: one can’t believe 
that 13 × 13 = 196, one can only accept this number mechanically from somebody 
else. ... At any rate I can say: “I believe it”, and act accordingly . . .—in short it 
depends on what you can do with the equation 13 × 13 = 196. For testing it is doing 
something with it. (RFM I 106) 


By the end of the book, in a passage added by the editors in the second 
edition, he is explicit on this point: 


It is clear that one can also apply an unproved mathematical proposition; even a false 
one. The mathematical proposition says to me: Proceed like this! (RFM VII 
72(10-11)) 


These two passages suggest that Wittgenstein might try to avoid the 
unhappy conclusion that false and unproved mathematical propositions are 
all meaningless by ascribing meaning to those which can be applied despite 
lacking proofs. (Though compare RFM VII 40(6), 45(12).) It must be 
admitted, however, that this move would not square well with his account of 
the role of proof and of public criteria in general. 
So it seems that these immediate difficulties with the verificationism of 
the transitional period are not significantly improved by the new criteriologi- 
cal approach. It is now worth asking whether the new position has the same 
consequences as the old concerning the relationship of mathematics to 
science and the nature of the infinite. On the first of these topics, recall that 
the argument of the transitional period for a strong distinction between 
mathematics and science rests on the idea that an empirical proposition 
could have several independent supports, while a mathematical proposition 
could have only one proof. (See also LFM, pp. 131—132.) This question of 
whether or not a single mathematical proposition can be proved in a number 
of different ways receives a considerable amount of attention in the RFM as 
well. 

In the new context, Wittgenstein is most sensitive to the fact that claiming 
a mathematical proposition cannot have two proofs is counterintuitive. 
Early on in the RFM, he writes: 


Of course it would be nonsense to say that one proposition cannot have two proofs — 
for we do say just that. (RFM III 58) 


But despite his desire to admit two proofs for a single proposition, he has 
trouble finding room for this possibility in his scheme. Comparing a proof in 
stroke notation with one in decimal notation, he considers the possibility 
that they prove the same thing: 


— The same thing? What is the same thing? — Well, the stroke proof will convince me 
of the same thing, though not in the same way.—Suppose I were to say: “The place 
to which a proof leads us cannot be determined independently of the proof.” —Did a 
proof in the stroke system demonstrate to me that the proved proposition possesses 
the applicability given it by the proof in the decimal system—was it e.g. proved in the 
stroke system that the proposition is also proved in the decimal system? (RFM III 
57) 


On the criteriological approach, it seems unavoidable that the proof is part of 
its own calculus (RFM III 58), that one proof cannot be replaced by another 
(RFM III 59), and that the conclusions of two different proofs have 
different criteria and thus different meanings. A new proof of an old 
proposition would automatically give that proposition new criteria, a new 
meaning, and thus convert it into a new proposition. 

These conclusions apparently bothered Wittgenstein, for he returns to 
this question late in the RFM: 


Now how about this—ought I to say that the same sense can only have one proof? Or 
that when a proof is found the sense alters? (RFM VII 10(1)) 


Here he realizes that though these seem unavoidable results of his account of 
meaning in terms of public criteria, they can be moderated by attention to 
the linguistic role of the asserted propositions: 


It all depends what settles the sense of a proposition, what we chose to say settles its 
sense. The use of the signs must settle it; but what do we count as the use?— That 
these proofs prove the same proposition means, e.g.: both demonstrate it as a suitable 
instrument for the same purpose. And the purpose is an allusion to something 
outside mathematics. (RFM VII 10 (3-5)) 


So, propositions with two different proofs might be counted as the same if 
they have the same extra-mathematical application. (See also RFM VII 43.)
This is not likely to help us identify the propositions proved in stroke 
notation with those proved in decimal notation, but it might, for example, 
lead us to the identification of propositions about curves with propositions 
about equations (RFM VII 10(11)). In isolation from their application, 
though, in the context of pure mathematics, these propositions remain as 
distinct as ever: 


When two proofs prove the same proposition it is possible to imagine circumstances 
in which the whole surrounding connecting these proofs fell away, so that they stood 
naked and alone, and there were no cause to say that they had a common point, 
proved the same proposition. One has only to imagine the proofs without the 
organism of applications which envelopes and connects the two of them: as it were 
stark naked. (RFM VII 10 (14-15)) 


Thus within mathematics proper, the counterintuitive conclusions remain, 
but from the perspective of the language game as a whole, Wittgenstein finds 
us a way to make the identifications we are inclined to make. 

It must be noted that this appeal to the role of applications to prevent 
unpleasant inferences from the criteriological approach is much more 
plausible than the one mentioned a few pages back. There the goal was to 
attribute sense to propositions without proofs (false propositions and 
unproved conjectures) by reference to their possible applications. Within 
Wittgenstein’s scheme, it is hard to imagine how any proposition could have 
a role in the language game without first enjoying public criteria for correct 
assertion. The present appeal to role in the linguistic community doesn’t 
require this, only that two sets of public criteria (i.e. two proofs) be
combined as criteria for the same proposition. 

In any case, the manner in which we are now allowed to say that a 
mathematical proposition can have several proofs does little to aid those 
resisting the stark distinction between mathematics and science. We are now 
able to avoid the counterintuitive view that a mathematical proposition can 
have only one proof, but we cannot do this within mathematics itself. 
Scientific propositions can be said to be supported by several types of 
evidence without appeal to anything extra-scientific, while the parallel claim 
for mathematical propositions requires support from the extra-mathematical.
This contrast still serves as a distinction between the two.

During the transitional period, this contrast could be traced to the view 
that mathematical statements are analytic, while scientific ones are synthetic.
In that context, a mathematical proof is seen as an analysis of the
meaning, while supports for empirical propositions come from outside. But 
when the rule-following argument eliminated the possibility of a traditional 
semantic account of meanings which determine the correctness and 
incorrectness of future usage, it took with it any hope of an appeal to a
contrast between what is true by virtue of facts about such meanings and 
what is true by virtue of facts about the world. 

Nevertheless, Wittgenstein manages to resurrect a version of this contrast 
in his new criteriological context in the form of the distinction between 
criteria and symptoms. Mathematical propositions (seen from within 
mathematics) differ from scientific ones in that evidence for the latter can 
take the form of both criteria and symptoms, while evidence for the former is 
all criterial. For this reason, no mathematical proposition can have two 
proofs without the second one constituting a new criterion and changing the 
meaning of the proposition, while a scientific proposition can acquire new 
evidential connections in the form of symptoms which do not by themselves 
change the meaning of that proposition. As might be expected, the idea that 
there is no such thing as a symptom in mathematics goes back to the 
transitional period (PR 171(11-12); PG 22(9), 24(10-11), 26(4)).

Wittgenstein’s argument for the claim that mathematics is all criteria 
while science involves symptoms as well is central to the RFM, and its main 
thrust is quite simple: he sets out to show that calculations and proofs are not 
experiments. Presumably calculations and proofs establish the evidential 
relations of mathematics while experiments do this job in natural science. If 
Wittgenstein can show that the relations established by the former are 
meaning relations, criteriological connections, while those established by 
the latter are symptomatic, his job will be done. The argument turns, then, on 
the nature of the differences Wittgenstein finds between them. I’ll take these 
up in the next section. 

Having concluded that calculations and proofs are not experiments, 
Wittgenstein argues as follows. Since the evidential connections of mathematics
are established by the former and the evidential connections of
science by the latter, it follows that there is a sharp distinction between the 
two. Furthermore, the details of the disanalogies between proofs and 
experiments show that the distinction falls between the two disciplines along 
the lines originally suggested: mathematical connections, established by 
proofs and calculations, are purely criteriological (grammatical, conventional)
while scientific connections, established by experiment, are symptomatic
(empirical, contentful). Thus we hear that


. . . a proof helps communication. An experiment presupposes it. (RFM III 71(1)) 


There is no doubt at all that in certain language games mathematical propositions play
the part of rules of description, as opposed to descriptive propositions. (RFM VII 6) 


And we are treated to rhetorical questions like these: 


... why should not mathematics, instead of ‘teaching us facts’, create the forms of 
what we call facts? (RFM VII 18(7))


Can we say that mathematics teaches an experimental method of investigation, 
teaches us to formulate empirical questions? (RFM VII 44) 


So, though the treatment has varied somewhat and surely become more
elaborate, the sharp divide between mathematics and science which began in 
the Tractatus and carried through the PR and the PG is still strong in the 
RFM. 

The second important question for the philosophy of mathematics raised 
in connection with our comparison of the late view with that of the 
transitional period concerns the status and nature of the infinite. On this 
issue, the new position is less stark than the old, but I think the upshot is not 
much different. Talk of infinity is not strictly forbidden, but it is under
suspicion: 

It is essentially a perspective, and a far-fetched one. (Which does not express any
reproach.) But it must always be quite clear how far-fetched this way of looking at it 
is. For otherwise its real significance is dark. (RFM V 21) 


What is this real significance? We might be tempted to find it in our pictures 
of the infinite: 


The difficulty is not that we can’t form an image. It is easy enough to form some kind
of image of an endless row, for example. The question is what use the image is to us. 
(RFM V 6)


If not from these images, where does our talk of infinity get its significance 
and use? As during the transitional period, the only real infinity is that of the 
calculus: 


“Ought the word ‘infinite’ to be avoided in mathematics?” Yes; where it appears to 
confer a meaning upon the calculus; instead of getting one from it. (RFM II 58) 


When we are tempted to think that there is a subject matter to mathematics 
outside the calculus, we are brought up short: 


Suppose it were said: “By calculating we get acquainted with the properties of 
numbers”. But do the properties of numbers exist outside the calculating? (RFM III
58)


During the transitional period, the justification for identifying a piece of
mathematics with its notational form was obscure. In the criteriological 
context, it is no longer surprising; recall the defense (discussed earlier) of the 
refusal to identify arithmetic in stroke notation with arithmetic in decimal 
notation. Mathematics is linguistic, without subject matter, essentially tied 
to the techniques of its calculus. 

Of course there is no actual infinity in the notations of mathematics, but 
this should not be taken to mean that Wittgenstein is advocating finitism: 


Finitism and behaviorism are quite similar trends. Both say, but surely, all we have 
here is . . . Both deny the existence of something, both with a view to escaping from a
confusion. (RFM II 61) 


The behaviourist’s mistake, according to Wittgenstein, is to mistake the 
criteria for the application of sensation terms (characteristic behaviour) for 
the referent of the term. The fact is that these terms have no referents; they 
are not part of a descriptive language game. Here we have a similar situation.
The criteria for statements about the infinite are proofs, but these finite 
objects are not the referents of those statements. In mathematics, just as in 
sensation talk, we are not in a descriptive language game. Just as sensation 
statements cannot be defined in terms of, or replaced by, statements about 
behaviour, statements of infinitary mathematics cannot be defined in terms 
of, or replaced by, finitary statements. 1 

As before, one of the first consequences of the view that infinity is in the 
infinite possibilities of the calculus is an attack on modern set theory. 
Perhaps because of his new rejection of finitism, Wittgenstein is now willing 
to admit that the technique of one-to-one correspondence gives some sense 
to the term ‘infinite sequence’, and thus to the numeral ℵ₀, but he does not
think this is enough to justify our talk of the number of infinite sequences: 


From the fact, however, that we have an employment for a kind of numeral which, as 
it were, gives the number of members of an infinite series, it does not follow that it 
also makes some kind of sense to speak of the number of the concept ‘infinite series’; 
that we have here some kind of employment for something like a numeral. For there 
is no grammatical technique suggesting employment of such an expression. For I can 
of course form the expression: “class of all classes which are equinumerous with the
class ‘infinite series’” (as also: “class of all angels that can get on to a needlepoint”)
but this expression is empty so long as there is no employment for it. Such an 
employment is not: yet to be discovered, but: still to be invented. (RFM II 38)


This employment apparently has not yet been invented: 


There is no system of irrational numbers—but also no super-system, no ‘set of 
irrational numbers’ of higher-order infinity. (RFM II 33) 


(See also PI 426.) Though this new, non-finitist, approach to set theory is 
somewhat more lenient than the earlier one, little more of current cardinal 
arithmetic would survive it. 

Let me conclude this survey of the criteriological approach to mathematics
with a look at an aspect which did not appear during the transitional
period: the requirement that meaningful mathematical propositions have an 
identifiable role in the language game in addition to public criteria. This 
requirement has already been put to use several times in what has been said 
so far, but I want to call attention to one of these in particular: the rejection 
of the set theoretic idea of the cardinal number of the set of all simply infinite 
series on the grounds that no role had been developed for it (RFM II 38). 
Many complaints of this sort are launched against set theoretic ideas, for 
example: 

“These considerations may lead us to say that 2^ℵ₀ > ℵ₀.” . . . But if we do say it—what
are we to do next? In what practice is this proposition anchored? It is for the time being
a piece of mathematical architecture which hangs in the air, and looks as if it were, let
us say, an architrave, but not supported by anything and supporting nothing. ...
Certain considerations may lead us to say that 10^10 souls fit into a cubic centimeter. But
why do we nevertheless not say it? Because it is of no use. Because, while it does
conjure up a picture, the picture is one with which we cannot go on to do anything.
(RFM II 35-36) (See also RFM II 40-41, 56.)


1 Compare Kripke [1982], pp. 106-7.


This same sort of argument is used against Frege’s logical calculus: 


The formalization of logic did not work out satisfactorily. But what was the attempt 
made for at all? (What was it useful for?) Did not this need, and the idea that it must 
be capable of satisfaction, arise from a lack of clarity in another place? The question 
“what was it useful for?” was a quite essential question. For the calculus was not 
invented for some practical purpose, but in order ‘to give arithmetic a foundation’. 
(RFM III 85(9—10)) 


Against the true, unprovable sentence in the proof of Gödel’s incompleteness
theorem:


You say: “..., so P is true and unprovable”. That presumably means: “Therefore
P”. That is all right with me—but for what purpose do you write down this 
‘assertion’? It is as if someone had extracted from certain principles about natural 
forms and architectural style the idea that on Mount Everest, where no one can live, 
there belonged a chalet in the Baroque style. And how could you make the truth of 
the assertion plausible to me, since you can make no use of it except to do these bits of 
legerdemain? (RFM App. III 19) 


And against Dedekind’s cuts: 


Originally, the geometrical illustrations [of analysis] were applications of Analysis. 
Where they cease to be this they can be wholly misleading. What we have then is the 
imaginary application. The fanciful application. The idea of a ‘cut’ is one such 
dangerous illustration. (RFM V 29) 


The odd thing about all these arguments is that the pieces of mathematics 
referred to seem to have perfectly good roles: the number of simply infinite 
series and Cantor’s theorem are central to set theory; Frege’s system and 
Dedekind cuts are useful in the foundations of arithmetic and analysis; and 
Gödel’s sentence plays an essential role in one of the most important
theorems of our century. What can Wittgenstein mean by accusing them of 
failure to meet the requirement of identifiable role in the language game? 
The answer to this question is quite straightforward and revealing: 


I want to say: it is essential to mathematics that its signs are also employed in mufti. It 
is the use outside mathematics, and so the meaning of the signs, that makes the sign-game
into mathematics. (RFM V 2. See also LFM, pp. 33, 223.)


Concepts which occur in ‘necessary’ propositions must also occur and have a 
meaning in non-necessary ones. (RFM V 42) 


(See also RFM VII 32.) This view leads inevitably to the rejection of 
considerable amounts of what is ordinarily thought of as mathematics. This 
fact bothers Wittgenstein in one place: 

If the intended application of mathematics is essential, how about parts of
mathematics whose application—or at least what mathematicians take for their 
application—is quite fantastic? So that, as in set theory, one is doing a branch of 
mathematics of whose application one forms an entirely false idea. Now, isn’t one
doing mathematics none the less? (RFM V 5(1))


Within pages of this point, however, he has overcome his qualms. Speaking 
of set theory again, he writes: 


But isn’t it evident that there are concepts formed here—even if we are not clear 
about their application? But how is it possible to have a concept and not be clear 
about its application? (RFM V 7) 


The parts of pseudo-mathematics in question, then, satisfy the requirement 
that meaningful propositions have public criteria of correct assertion, but 
they fail to satisfy the requirement that the practice of asserting and denying 
them have an identifiable role in the activity of our linguistic community. 
They fail at this because the role in question is required to be outside 
mathematics itself. 

One final remark. Wittgenstein sometimes seems to deny that he is 
suggesting any reform of mathematics: 


It is not that a new building has to be erected, or that a new bridge has to be built, but
that the geography, as it now is, has to be described. (RFM V 52) 


The same impression might be drawn from his insistence that his aim is 


... not to attack Russell’s logic from within, but from without. That is to say: not to 
attack it mathematically—otherwise I should be doing mathematics—but its 
position, its office. (RFM VII 19) 

But by these remarks, I think he means that his attack on particular pieces of 
mathematics will not take the form of showing that they embody some 
mistake in logic, or that they are wrong by some mathematical criterion of 
correctness, but that their very status as mathematics is dubious. Thus: 


What I am doing is, not to show that calculations are wrong, but to subject the interest 
of calculations to a test. (RFM II 62) 


This determination not to challenge pieces of mathematics on mathematical 
grounds is perfectly consistent with an intention to show that they are not 
really meaningful on philosophical grounds. And success in the latter would 
recommend a reform of mathematics just as surely as success in the former: 


. . . a quotation from Hilbert: “No one will turn us out of the paradise which Cantor
has created.” I would say, “I wouldn’t dream of trying to drive anyone out of this
paradise.” I would try to do something quite different: I would try to show you that it
is not a paradise—so that you’ll leave of your own accord. I would say, “You’re
welcome to this; just look about you.” (LFM, p. 103. See also p. 141.)


So, though Wittgenstein does not repeat his explicit suggestion from the
transitional period that mathematics should be “pruned”, he still seems to
have something quite like this in mind.

1 There has been considerable debate in the literature over whether or not Wittgenstein
advocated a reform of logic, and hence of mathematics, along intuitionistic lines. In the text,
I’m suggesting that there is a strong sense in which Wittgenstein is advocating a reform of
mathematics that is independent of questions about the law of the excluded middle.

This view of mathematics as essentially linguistic, without subject matter,
non-descriptive, non-scientific, and in need of pruning, is not likely to 
appeal to realistic philosophers of mathematics. In theory, these and the 
further conclusions on mathematics detailed in the previous section, can be 
reached just as those for private language are reached, by following the 
methodology of the criteriological approach to language, that is, by 
inquiring into the assertability conditions of particular expressions and by 
examining the role that the practice of asserting and denying these expressions
plays in the life of the community at large. What I want to suggest in this 
section is that these conclusions are not simple consequences of this 
approach, that drawing them involves further non-trivial premises that 
mathematical realists might naturally reject. 

Let me begin with the analogy between mathematics and sensation talk. 
Though the number of pages of Wittgenstein’s available opus devoted to 
mathematics is quite large, the most explicit application of the criteriological 
approach is to sensation talk. We are often left only with the advice that 
similar considerations apply to mathematics: 


The confusion and barrenness of psychology is not to be explained by calling it a 
“young science”; it is not comparable to physics, for instance, in its beginnings. 
(Rather with that of certain branches of mathematics. Set theory.) For in psychology 
there are experimental methods and conceptual confusion. (As in the other case 
conceptual confusion and methods of proof.) . . . An investigation is possible in 
connection with mathematics which is entirely analogous to our investigation of 
psychology. (PI II, p. 232) 


(See also PI 426, 352.) In fact, as we’ve seen, the idea of an analogy between 
mathematics and sensation talk goes back to the transitional period where 
the verificationist semantics gave them both a firmer connection with reality 
than that enjoyed by physical object statements. 

This observation and the passage quoted earlier both draw attention to a 
related recurrent theme: the contrast between sensation talk and mathematics
on the one hand, and science on the other. In the PI, Wittgenstein
stresses that psychology differs markedly from physical science: 


Misleading parallel: psychology treats of processes in the psychical sphere, as does 
physics in the physical. Seeing, hearing, thinking, feeling, willing, are not the subject 
of psychology in the same sense as that in which the movements of bodies, the 
phenomena of electricity etc., are the subject of physics. You can see this from the 
fact that the physicist sees, hears, thinks over, and informs us of these phenomena, 
and the psychologist observes the external reactions (the behavior) of the subject. (PI 571)


We have already discussed the sharp differences drawn between mathematics
and physical science in the RFM.

So we are to see mathematics as like sensation talk, and both of these as 
very different from physics. This scheme of alliances seems to arise at a very
primitive level, for example, at the level of simple reference to the purported
objects of study: 


One can refer to an object when speaking by pointing at it. Here pointing is part of the 
language-game. And now it seems to us as if one spoke of a sensation by directing 
one’s attention to it. (PI 669) 


Of course, we can’t refer to sensations in that way. It is tempting to say that 
we refer to them indirectly (indeed this is suggested by the long quotation in 
the previous paragraph), but the discussion of the beetle in the box (PI 293) 
shows that in fact we don’t refer to anything; despite appearances, the 
language-game is not descriptive. This point is also expressed by saying that 
no reality corresponds to a non-descriptive language game. Speaking of 
mathematics, Wittgenstein says: 


So if you forget where the expression “a reality corresponds to” is really at home—
What is “reality”? We think of “reality” as something we can point to. It is this, that.
Professor Hardy is comparing mathematical propositions to propositions of physics. 
This comparison is extremely misleading. (LFM, p. 240) 


Mathematics and sensation talk are on a par, then, because they are both
non-descriptive, that is, in neither case is there a reality to which they 
correspond. 

In the RFM, we find another quite fundamental difference between 
descriptive and non-descriptive games: 


One would like to say that the understanding of a mathematical proposition is not 
guaranteed by its verbal form, as is the case with most non-mathematical 
propositions. (RFM V 25) 

Because their form suggests that they are descriptive when in fact they 
aren’t, the understanding, the correct use, of a non-descriptive expression 
requires a more comprehensive idea of where that expression fits into the 
linguistic practices of the community. Ignoring this leads to misunderstandings
of both sensation talk and mathematics.

So the mark of a descriptive language game is that reference can be fixed 
by pointing. Of course, this doesn’t mean what the realist means by the 
fixing of reference; rather, it just happens that the criteria for someone’s 
using the word to refer can be specified in this way. For example, the 
introduction of the word ‘table’ might be accompanied by a pointing gesture 
towards a table, and thereafter, the public criterion for someone’s understanding
the word is his using it in the presence of tables. (Of course what
counts as another table is determined only by our tendencies to see this as 
‘the same’ as that.) The role of this practice in the linguistic community is 
obvious: we use the term to talk about various items in our environment— 
tables. In this sort of case, criteria come as close as they ever do to coinciding 
with the ordinary Tarskian truth conditions disallowed by the rule-following
argument.

Now let’s recall what stops sensation talk from being descriptive in this
sense; presumably the reasons will run parallel in the case of mathematics.
The answer, discussed in section I, is that pointing towards an inner 
sensation cannot provide a criterion for the simple reason that that sensation 
is not public. No one else can check my usage of the term against a standard 
only I am privy to. And so, because there is now no difference between my 
thinking I’m checking and my actually checking, I can’t even check my own 
usage against this purported standard. The problem is that criteria must be 
public and inner pointing is not. 

How does this publicity requirement affect mathematics? The answer is
easy if the subject matter of mathematics is taken Platonistically: mathematical
entities inhabit a non-spatiotemporal universe of precise properties and
necessary interrelations. Such entities are not susceptible to pointing. In 
Kripke’s words: 


Mathematical statements are generally not about palpable entities: if they are indeed 
to be regarded as about ‘entities’, these ‘entities’ are generally suprasensible, eternal 
objects.1


Thus criteria for terms purportedly referring to mathematical entities 
cannot be fixed in the way required for a descriptive language-game. 

Two remarks are in order here. First, notice that this application of the 
publicity requirement to the case of mathematics uses an additional premise, 
namely, that mathematical entities are not public. The analogous premise in 
the case of sensations seems uncontroversial, but in the mathematical case, it 
comes to the acceptance of a strong version of Platonism. Second, the 
ontologies of several contemporary versions of mathematical realism are 
different enough from the Platonistic realm as traditionally understood to 
deny this premise. Examples of public mathematical objects are Resnik’s 
instantiations of mathematical structures, Kitcher’s ‘affordances’ for our 
collecting operations presented by the physical world, and my own sets of 
medium-sized physical objects (see note 2). I might also mention Parsons’s 
types of physical inscriptions.2 These can be perceived and pointed at; they
are public, unlike sensations. It might be objected that infinite and pure 
mathematical objects cannot be pointed at, but the same is true of physical 
objects like molecules without keeping that language game from being 
descriptive. 

But how can this be?—the conclusion that mathematical criteria are what 
they are was supposedly derived in the previous section by simple 
application of the criteriological methodology. Consider again the case of 
the simple mathematical statement ‘There are two towers in the 
International Trade Center’. It was argued that one of its assertability 
conditions would be the speaker’s inner inclination to count as he does. This 
condition is not public, however, so the criterion for correct application of 


1 Kripke [1982], p. 105.
2 Parsons [1979-80], pp. 142-168. 
the mathematical expression ‘two’ was taken to be using it when others are
also inclined to use it. Compare this case with the case of ‘table’. There also 
the criterion for correct use of the expression is use that agrees with the 
primitive inclinations of other speakers. The difference is that in the case of 
‘table’, we can give a further specification of the circumstances under which 
people will be so inclined, namely in the presence of tables. So the public 
criterion for the correct use of ‘table’ is using it in the presence of tables. In 
the case of ‘two’, on the other hand, there is nothing public that corresponds 
to our inclinations to count as we do, so the only public criterion available is 
the verbal behaviour of others. Thus all that can be said in this case is that 
the public criterion for the correct use of ‘two’ is using it when others are 
inclined to do so. 

The realist might remain unconvinced. He could insist that something 
public does correspond to our primitive inclinations to count as we do, for 
example, the presence of sets with certain number properties. The set of 
towers in the Trade Center is just as public as any physical object; it has two 
elements just as surely as that physical object is a table. Once again, this is 
not to say, as the rule-following argument is supposed to show that we 
cannot, that ‘two’ refers to two-element sets independently of our practice 
and inclinations to see as ‘the same’ any more than ‘table’ refers to tables 
independently of our practice and inclinations. The point is rather that the 
criteria for correct application of both terms involve the presence of the sort 
of entity the realist would like to say the term refers to, and thus, that both 
language-games are descriptive in Wittgenstein’s sense. 

This shows that Wittgenstein’s idea that mathematics is non-descriptive, 
like sensation talk, does not follow by simple application of the criteriological
methodology. The extra premise required, that mathematical entities
are not public, is one a realist might fairly reject. The question that remains
is why Wittgenstein holds that mathematical entities fail the publicity 
requirement. One answer would be that he thinks of mathematical entities 
Platonistically. This seems quite unlikely. Another answer would be that 
his argument is aimed at those who think of mathematical entities 
Platonistically, but while this is true much of the time, it will not help here. 
In applying the criteriological approach, he is not arguing against a mistaken 
view; he is presenting what he takes to be the best alternative to the mistaken 
views (like Platonism) which the rule-following argument has undermined. 
In fact, I think Wittgenstein’s reason for thinking mathematical entities are 
not public is quite simple: he thinks there are no mathematical entities, only 
mathematical calculi.2 This premise has remained constant since the 
Tractatus, with as little supporting argument here as there. 


1 I put the example in terms of sets, but the other realists referred to in note 2 would express 
matters differently. 

2 This is not to say that for Wittgenstein, numerals don’t stand for numbers (PI 10). But this 
‘stands for’ statement doesn’t mean what it would mean in a descriptive language game. See 
Kripke [1982], pp. 75-76.

At this point, the job of undermining Wittgenstein’s sharp disanalogy
between mathematics and science has come a considerable distance. The 
idea that the language-game of mathematics is not descriptive in the sense 
that talk of physical objects is descriptive has been shown to rest on the
question-begging assumption that mathematical entities are not public. 
Another source of the disanalogy is the purported contrast in the nature of 
evidential relations: mathematical evidence is criterial, linguistic, conventional,
while scientific evidence is symptomatic, empirical, contentful. A
sharp distinction of this sort has been with Wittgenstein since the 
Tractarian contrast between sense and nonsense, through the transitional 
period’s distinction between empirical and grammatical. Quine’s sub- 
sequent attack on the analytic/synthetic distinction has pushed dichotomies 
like these out of philosophical fashion. Even Wittgenstein himself some- 
times admits that the symptom/criterion distinction is shifting and inexact 
(Z 438, 466; BIBR, p. 25; PI 79). Other writers have persuasively attacked 
the supposed special status of criteria, so I will not reargue that case here.1 

The possibility remains that Wittgenstein has drawn a real and significant 
distinction between mathematical and scientific evidential relations that is 
independent of that between criteria and symptoms. Indeed he does spend a 
considerable amount of time arguing that proofs are not experiments. The 
deepest and most characteristically Wittgensteinian of these arguments 
begins with the claim that the relationship of a proof or calculation to its 
result is not the same as that of an experiment to its outcome. To see this, we 
are to consider our reactions to unexpected results: 


If 2 and 2 apples add up to only 3 apples, i.e. if there are 3 apples there after I have put 
down two and again two, I don’t say: “So after all 2+2 are not always 4”; but 
“Somehow one must have gone”. (RFM I 157) 


Ought I to say now: “It was an experiment again, but I was certain of the result”? But 
am I certain of the result in the way I am certain of the result of the electrolysis of a 
mass of water? No, but in another way. If the electrolysis of the liquid did not yield 
..., I should consider myself crazy, or say that I no longer have any idea what to 
say. (RFM I 76) 


The initial contrast is between cases in which the unexpected result causes 
me to doubt that my physical assumption is true (as a surprising chemical 
result makes me doubt my theory of the chemicals involved) and cases in 
which the unexpected result makes me think the experiment has somehow 
been contaminated (as when I assume one of the apples must have vanished 
without my noticing). One might think that cases of the second sort are just 
those in which I am more sure of the underlying principle; this would 
explain why I consider it more likely that the experiment has misfired than 
that the principle is wrong. But, in the electrolysis passage, Wittgenstein 
insists that if that underlying principle is physical, one’s reaction is not the 
same as if it is mathematical. 


1 For example, see Rorty [1973], and Chihara and Fodor [1965]. 
His next move is to offer an explanation for this disanalogy. The first 
suggestion of the proposed solution comes early in the RFM: 


What am I calling “the multiplication 13 × 13”? Only the correct pattern of 
multiplication, at the end of which comes 169? Or a ‘wrong multiplication’ too? 
(RFM I 112) 


Of course it is part of our calculating practice not to accept different results 
for the same calculation (RFM VI 20). A few years after the above passage, 
the answer to the ‘wrong multiplication’ question is made clear: 


The reason why one really cannot say that one learns [a mathematical] proposition 
from experience is—that one only calls it this experience when this process leads to 
this result. The experience meant consists as such of this process with this result. 
(RFM IV 50) 


A calculation, then, is not this calculation if it does not reach this result; a 
proof is not this proof if it doesn’t reach this conclusion. Calculations and 
proofs are meant to serve as models, as patterns, and these include both 
process and result. The experiment, by contrast, is independent of its result. 
(See RFM I 109.) This explains the difference in our reactions to 
unexpected results. In an experiment, this might mean a number of things: 
the samples are impure, our physical assumption was mistaken, or 
something else entirely. In a calculation or a proof, the unexpected result 
shows that we weren’t performing the calculation or proof we had intended. 

So, the idea that experiments are simply processes, while proofs and 
calculations are processes with particular conclusions, accounts for the 
purported differences in our reactions to unexpected results. But is there 
any independent justification for this view? In fact, it can be developed 
directly from general rule-following considerations. To see this we are to 
compare the claim ‘If you follow the rule for successor, you will say that 5 is 
the successor of 4’ with the very similar one ‘If you follow the rule for 
successor as best you can, you will say that 5 is the successor of 4’. The point 
of this exercise is to show that only the second of these is a genuine empirical 
prediction; the first is a mathematical proposition in disguise: 


The reason why “If you follow the rule, this is where you’ll get to” is not a prediction 
is that this proposition simply says: “The result of this calculation is...” and that is a 
true or false mathematical proposition: The allusion to the future and to yourself is 
mere clothing. (RFM VI 15(12)) 


(See also RFM VII 4.) We would not say that you were following the rule for 
successor if you got any other answer, thus getting the answer 5 for the 
successor of 4 is a criterion for correct calculation. 4+1=5 is 
overdetermined: 


Thus the truth of the proposition that 4+1 makes 5 is, so to speak overdetermined. 
Overdetermined by this, that the result of the operation is defined to be a criterion 
that this operation has been carried out. (RFM VI 16(6)) 
(See also RFM IV 7.) So these simple rule-following considerations lead us 
to the conclusion that the result of a calculation is part of that calculation, 
and thus, to a disanalogy between calculation and experiment. 
Furthermore, they do so in such a way as to show that calculations and 
proofs in fact establish new criteria, in contrast to the symptomatic 
connections which can be made experimentally. 

In responding to this line of thought, let’s first return to the pivotal claim 
that we react to an unexpected result of a proof or calculation by assuming 
we made some mistake, while we react to an unexpected result of an 
experiment by assuming that our physical theory is incorrect. I think the 
contrast drawn here is not as clear as it seems. If the physical principle on 
which my expectation about an experiment is based is sufficiently well- 
established, I think our response would be what Wittgenstein claims is 
peculiar to proof, namely, we would suspect malfunction of some kind. 
Thus, for example, I suppose the chemist would assume he had started with 
a sample that was not pure water, or that his apparatus was faulty, if the 
experiment he took to be the electrolysis of water didn’t yield hydrogen and 
oxygen. On the other hand, if the result of the proof or calculation is not 
firmly established, we might well question that conclusion rather than the 
execution of the proof or calculation. If a mathematician remains un- 
convinced by the proof a number of other mathematicians have put forward 
for a surprising result, he might well assume that the proof is erroneous and 
the supposedly proved proposition false. In both the physical and the 
mathematical cases, an unexpected result casts doubt first on the link most 
likely to be wrong: if the expected result is well-established, that doubtful 
link is the proof or experiment; if the result is doubtful and the procedure 
carefully done, the doubtful link is the result. 

But even if this conclusion is correct, it is not enough to defuse 
Wittgenstein’s analysis of the relationship of process to result in the two 
cases, because that analysis also enjoys an independent defense based on 
rule-following considerations. By its nature, a rule-following situation 
involves a non-empirical fact: 


When I write down a bit of a series for you, that you then see this regularity in it may 
be called an empirical fact, a psychological fact. But, if you have seen this law in it, 
that you then continue the series in this way—that is no longer an empirical fact. 


But how is it not an empirical fact?—for “seeing this in it” was presumably not the 
same as: continuing it like this. 


One can only say that it is not an empirical proposition, by defining the step on this 
level as the one that corresponds to the expression of the rule. 


Thus you say: “By the rule that I see in this sequence, it goes on in this way.” Not: 
according to experience! Rather: that just is the meaning of this rule. (RFM VI 26) 


This non-empirical fact is a mathematical proposition: 


Suppose I have taught somebody to multiply; not, however, by using an explicit 
general rule, but only by his seeing how I work out examples for him. I can then set 
him a new question and say: “Do the same with these two numbers as I did with the 
previous ones”. But I can also say: “If you do with these two what I did with the 
others, then you will arrive at the number . . .”. What kind of a proposition is that? 


“If you do with these numbers what I did with the others, you will get . . .”—that 
surely means: “The result of this calculation is . . .”—and that is not a prediction but 
a mathematical proposition. (RFM VII 4) 


(See also RFM VI 15(12).) This identification of the mathematical 
proposition with the non-empirical proposition explains why: 


A mathematical proposition stands on four feet, not on three; it is over-determined. 
(RFM IV 7) 


(See also RFM VI 16(6).) Getting the result we do when we follow a rule in a 
proof or calculation is part of what it is to follow that rule, so we couldn’t get 
a different result and be doing the same proof or calculation. This explains 
why the appearance of an unexpected result must always lead us to assume 
that the proof or calculation has misfired, rather than to the conclusion that 
the mathematical proposition in question is false. (We say ‘I counted 
wrong’, not ‘2+2 must not be 4 after all’. See RFM I 157.) 

When applied to mathematical cases, this conclusion may be exactly what 
one hoped for, but it is important to notice that the range of these 
considerations reaches beyond mathematics. Though the artificial concen- 
tration on mathematical examples in rule-following discussions might 
suggest it, mathematics is not the sole custodian of rules. Wittgenstein 
acknowledges this: 


The concept of the rule for the formation of an infinite decimal is—of course—not a 
specifically mathematical one. It is a concept connected with any rigidly determined 
activity in human life. The concept of this rule is not more mathematical than that of: 
following a rule. Or again, this latter is not less sharply defined than the concept of 
such a rule itself. (RFM VII 42) 


But even the use of the word ‘rigid’ in this admission is a bit deceptive; rule- 
following considerations apply to even the most commonplace referential 
connections of our language, such as calling certain things ‘tables’, and we 
rarely think of all referential behaviour as ‘rigid’. (This is probably due to 
the unavoidable vagueness of many terms.) At any rate, if the above conclu- 
sions are correct, they must apply to these non-mathematical cases as well. 

This line of thought leads to some surprising results. Suppose I am taught 
to use ‘table’ to refer to tables. Then if I ‘see the same regularity’ that my 
teachers see (RFM VI 27), if I ‘use the words as my teachers do’ (RFM VII 
4), I will say that this (my cat) is not a table and that that (the thing 
surrounded by poker players) is. These facts are non-empirical. 
Furthermore, I see no reason to suppose that ‘the allusion to myself and to 
the future’ has any more relevance in this case than in the mathematical one 
(RFM VI 15(12)). Thus I can hardly avoid the conclusion that the 
statements ‘This is not a table’ and ‘That is a table’ are just as over-determined 
as the mathematical propositions discussed earlier. But aren’t 
these typical empirical propositions?! 

As I mentioned, most people will find the conclusions of over- 
determination and non-empirical character just right in the mathematical 
context, but when seemingly empirical propositions are caught in the same 
net, we must all become suspicious. Though these cases are not emphasized 
by Wittgenstein, he does mention them in passing: 


Someone asks me: What is the color of this flower? I answer: “red”.—Are you 
absolutely sure? Yes, absolutely sure! But may I not have been deceived and called 
the wrong color “red”? No. The certainty with which I call the color “red” is the 
rigidity of my measuring-rod, it is the rigidity from which I start. When I give 
descriptions, that is not to be brought into doubt. This simply characterizes what we 
call describing. (RFM VI 28) 


I take this passage to mean that ‘This flower is red’ is non-empirical and 
over-determined in this context. Once this characterization has made 
inroads into the realm we usually think of as empirical, there is only one way 
to keep it from invading the entire physical object and natural science 
language-game: the distinction between criteria and symptoms. This 
helps explain the extreme importance of this distinction within the 
Wittgensteinian scheme. Rule-following considerations are used to give a 
non-empirical character to mathematics, but rules occur outside mathe- 
matics. It is thus unavoidable that some seemingly empirical propositions 
become criteriological, but it would be too counterintuitive to extend this to 
statements like ‘It is likely to rain’ uttered in the presence of a falling 
barometer. Thus this last cannot be part of what constitutes following the 
rules for use of the word ‘rain’; it must be the result of an inductively 
established symptom. 

To clarify this use of the criteria/symptom distinction, compare what 
Wittgenstein says about the connection between divisibility into congruent 
isosceles triangles and squareness (see section 2) with his line on the 
connection of barometer readings with rain: 


The fluctuation in grammar between criteria and symptoms makes it look as if there 
were nothing at all but symptoms. We say, for example: “Experience teaches us that 
there is rain when the barometer falls, but it also teaches that there is rain when we 
have certain sensations of wet and cold, or such-and-such visual impressions.” In 
defense of this one says that these sense impressions can deceive us. But here one fails 
to reflect that the fact that the false appearance is precisely one of rain is founded on a 
definition. (PI 354) 


In each case, we learn to use the word under certain circumstances: such- 
and-such sense-impressions. In both cases, the connection between these 
sensations and the application of the term is ‘overdetermined’ in 
Wittgenstein’s sense. As I mentioned in the case of tables, it is already odd to 
say that my saying ‘It’s raining out’ in the presence of certain sense- 
impressions is non-empirical, but the oddness becomes worse when the 
range of evidence is extended. In the mathematical case, we come to accept 
the proof that a square can be divided into congruent isosceles triangles; this 
gives us a new criterion for application of the word ‘square’, which is to say, a 
new concept of ‘square’. In the empirical case, we do various experiments, 
and come to accept barometer readings as evidence for correct application of 
the word ‘rain’. Why is this new form of evidence not criterial? Why don’t 
we say that we’ve changed our concept of ‘rain’? Of course it would be 
counterintuitive to count every bit of scientific progress as a case of meaning 
change, but I see no more reason to say this of mathematical progress. 

As the sharp criterion/symptom distinction is not tenable, it cannot be 
appealed to here to forestall the conclusion that all seemingly empirical 
statements are non-empirical and overdetermined in the same sense that 
mathematical statements are. In the current context, I don’t think we can 
separate cases in which getting a given result is part of what constitutes 
following the rule, and cases in which the rule is simply followed. If this is 
right, there is no reason to insist that mathematical cases always fall in the 
former category, or that proofs and experiments can be distinguished on 
these grounds. Mathematics and science are both rule-governed activities, 
but this gives us no grounds on which to distinguish between them. 

I have concentrated in this section on Wittgenstein’s analogy between 
mathematics and sensation talk, and his disanalogy between mathematics 
and science. I have argued that these depend on more than the pure rule- 
following and criteriological considerations that motivate the private 
language argument. In particular, in applying the criteriological approach to 
the case of mathematics, Wittgenstein uses two additional premises—the 
non-publicity of mathematical objects and the sharp distinction between 
criteria and symptoms—both of which a realistic philosopher of mathe- 
matics might well reject. I conclude that the unpalatable picture of 
mathematics sketched in section 2 is by no means inevitable, even if the rule- 
following argument is sound. 
Theory-Assessment in the 
Historiography of Science 


by JAMES W. McALLISTER 


1 History and Norms of Rationality 
2 Varieties of Relativism 
3 Comprehension and Criticism 
4 The Roles of Evaluation 


This paper argues that evaluation of the truth and rationality of past scientific 
theories is both possible and profitable. The motivation for this enterprise is traced 
to recent discussions by I. Lakatos, L. Laudan and others on the import of history for 
the philosophy of science; several objections to it are considered and T. S. Kuhn is 
found to advance the most substantive. An argument for establishing judgements of 
rationality and truth in the face of scientific revolutions is presented; finally evidence 
is offered for the value of such assessments to historiography and to debates on 
scientific progress. 


1 HISTORY AND NORMS OF RATIONALITY 


The historian of science A. Koyré revelled in exclaiming of past thinkers 
‘And he was right!’. This paper will argue that evaluations of the truth and 
rationality of past science form a proper part of the historian’s task. 
Evaluations of past truth consist in the attribution of our own truth-values to 
previous scientific assertions, whilst evaluations of rationality mean the 
tracing of any divergences between past inferences and those that we would 
regard as legitimate under the same premises. To establish our conclusion it 
will be necessary to discuss the philosophical motivations for such 
assessments, some arguments purporting to show their impossibility or 
undesirability, and the roles of evaluation in historiography. Theory- 
assessment will be found much more valuable for our understanding of past 
science than merely going through history picking off wrong answers. 
The motivation for this enterprise springs from the recent reaction to a 
previously standard view according to which history and philosophy were 
separate if complementary ways of studying science. The historian on this 
view was dealing with facts and data, seeking to arrange them into a coherent 
and convincing tale about how scientific ideas have evolved; philosophy of 
science was by contrast perceived as a normative, evaluational and largely a 
priori investigation of how science ought to proceed. This view has been 
criticized by many recent thinkers including T. S. Kuhn, I. Lakatos and L. 
Laudan. Whilst granting that the aim of philosophical inquiry is the 
generation of a set of norms, including some to direct the evaluation of 
theories, these critics have pointed out that any philosophical theory of 
science which failed substantially to square with the history of science would 
be deemed unacceptable: we should reject for instance an account of theory- 
change which interpreted a substantial part of the history of science as 
irrational. This criticism implies that history has an evidential role for 
philosophy of science. 

But this last point presupposes acceptance of a thesis equally foreign to the 
standard view: that evaluations of the truth and rationality of past science are 
potentially well-founded. If we did not believe ourselves capable of making 
assessments of past science, we would be unable to gauge the standard of 
rationality of episodes of actual history and therefore to judge the extent of 
the agreement between those episodes and models of rationality embodying 
our philosophical norms. The cited critics of the standard view have 
accordingly each developed ways for evaluating past science in order to 
ascertain the degree of concordance between their methodological rules and 
actual history. This is the aim of Lakatos’ theory of rational reconstructions, 
which purports ‘to explain how the historiography of science should learn 
from the philosophy of science’ ([1971], p. 91; emphasis in original). In 
order to erect a rational reconstruction of the historical record one tells 
history as it ought to have happened; the actual beliefs of the historical 
agents who star in the story are often ignored or deliberately distorted. 
Sometimes not just the truth-values of the agents’ beliefs but also their 
canons of rationality are modified: in discussing the hypotheses on chemical 
composition of the nineteenth-century physician William Prout, for inst- 
ance, Lakatos urges the historian to ignore one of Prout’s methodological 
tenets about the experimental grounding of theories. Once an episode has 
been thus recast, Lakatos proceeds to appraise its rationality. Yet whatever 
the verdict, the historical episode itself and the beliefs that figure in it remain 
unevaluated as well as unexplained. 

Aware of the inadequacies of Lakatos’ view, Laudan has proposed an 
intuitionist base upon which to provide history with an evidential role. 
Laudan believes he identifies within the history of science a number of ‘pre- 
analytic intuitions’ of rationality, constituting ‘a subclass of cases of theory- 
acceptance and theory-rejection about which most scientifically educated 
persons have strong (and similar) normative intuitions’ ([1977], p. 160). 
This set contains among others the intuition that it was rational by 1890 to 
reject the view that heat was a fluid, or that it was irrational after 1830 to 
accept the biblical chronology as a literal account of Earth history. ‘The test 
of any putative model of rational choice is whether it can explicate the 
rationality assumed to be inherent in these developments’ (ibid., p. 161; 
emphasized in the original). 

This construal of the evidential relation between history and methodology 
meets a number of difficulties;1 one of interest here concerns Laudan’s 
account of how to make evaluative judgements in the historiography of 
science, and particularly his assumption that it is possible and uncontroversial 
to place confidence in pre-analytic intuitions about scientific 
rationality. The term ‘intuition’ is often a euphemism for common belief 
engendered by prolonged exposure or persuasion, a highly insecure basis 
upon which to rest our conception of scientific rationality. Under further 
cogitation, to have an intuition is to possess immediate knowledge of a 
concept without being able to define it; thus to present a certain historical 
episode as intuitively rational is to claim that it is rational without being 
capable of explicating why it is so or what form its rationality takes. This 
assertion would be historiographically insufficient because by definition 
unsupportable by evidence. So even in the more respectable sense of the 
term ‘intuition’, Laudan is gravely oversimplifying the task of the historian 
by denying any problematic feature to the interpretation of key episodes in 
the history of science. In view of Laudan’s praiseworthy emphasis on the 
importance of history to philosophers of science, it is disappointing to find a 
similar misrepresentation of historiographic analysis. 

The problem which Lakatos and Laudan are here failing to despatch is 
most clearly revealed by the aid of the familiar distinction between two 
senses of the term ‘history of science’, one denoting the actual past 
development of science and the other the descriptive and explanatory 
accounts composed about that development. Authors like Lakatos and 
Laudan, concerned to establish an evidential relationship between history 
and philosophy, focus upon the import for methodology of the actual past 
development of science; but since that development can be accessed only 
through the writings of historians, and since for an evidential relation to be 
established we require evaluations of past science, the problem thereby 
raised is that of determining whether normative elements from the 
philosophy of science are able within accounts of historical episodes to assess 
the truth and rationality of past science. 

It is universally acknowledged that philosophical beliefs about science 
already influence the historian’s selection and treatment of his subject- 
matter: scholars as different as Koyré and Kuhn will give radically divergent 
accounts of the same episodes. But a further level at which philosophical 
judgements enter the written history of science is that of the epistemic 
assessment of past theories. Whereas the former normative influence is a 
pervasive conditioning which the individual historian may find hard to 
recognize explicitly in himself, the latter is generally a conscious use of a 
historiographic tool. As discussed in the next section, some currents of 


1 Some of these difficulties are discussed in Brown [1980], pp. 238-40, on which I have drawn. 
thought have seen in this deliberate intrusion of modern methodological 
norms into past science a violence against historicism which must be 
shunned. 


2 VARIETIES OF RELATIVISM 


Among the Egyptians and Arabians, the Paracelsians, and some other moderns, 
chemistry was very phantastic, unintelligible, and delusive. [. . .] The Royal Society 
have refined it from its dross, and made it honest, sober and intelligible [. . .]. 


Joseph Glanvill ([1668], p. 12) believed it eminently natural to formulate 
normative evaluations of past science, and his confident expectation is still 
shared by some today. It is noted that one of the primary aims of the written 
history of science is to explain why certain experiments, theories and 
research traditions were accepted, rejected or modified: many such explana- 
tions will involve normative assessments. The query ‘Why did Newton 
reject Descartes’ vortex-theory of planetary motion?’ will elicit the response 
‘Because Newton judged the vortex-theory grossly incompatible with data 
about the positions and velocities of the planets’. The inquiry then arises, 
‘Was Newton correct in his criticism of the Cartesian theory, and why did he 
believe that a theory incompatible with data was to be rejected?’, striving to 
establish the truth-value of Newton’s beliefs and the standard of his 
rationality. We should be better informed on this episode if we were able to 
conclude e.g. that Newton had been justified in his dissatisfaction with 
Descartes’ theory, or that he held a non-rational opinion of why theories at 
odds with data should be rejected; an unwillingness to refer to normative 
judgements would deprive us of this knowledge. Indeed even to conceive of 
a history of science one has to decide what counts as science and this arguably 
requires the introduction of evaluative norms. 

Despite this expectation, strands in recent philosophy of science have 
tended to doubt the desirability and at times the very possibility of 
formulating such judgements. Similar currents of thought may in this 
context fittingly be termed varieties of relativism since they hold that past 
beliefs can be examined only in relation to historical background, thus 
abandoning substantive portions of the quest for commensuration. Five 
such stances will here be considered: historiographic conventionalism, 
externalism, philosophical nonalignment, hypothetico-deductivism and 
the Kuhnian school.1 

1. The historiography of science has recently witnessed a reaction against 
the complacency that H. Butterfield has called ‘Whiggish’ ([1931], esp. pp. 
9-31) and J. Agassi ‘inductivist’ ([1962], e.g. pp. 1-3). This consists in the 
assumption that the present state of science is the ideal towards which have 
striven all previous beliefs, practices and institutions and is in this sense the 
paradigm of achievement. The inductivist historian thus inclines in the light 


1 The following survey is partially inspired by Hesse [1973], pp. 128-31. 
of contemporary knowledge to view all outdated theories as trivially false if 
not ridiculous and the past as an epic: Butterfield saw as an instance of this 
misrepresentation 


the tendency in many historians to write on the side of Protestants and Whigs, to 
praise revolutions provided they have been successful, to emphasize certain 
principles of progress in the past and to produce a story which is the ratification if not 
the glorification of the present. ([1931], p. v) 


Post hoc ergo melius hoc: in his opposition to this slogan Butterfield has been 
joined by S. G. Brush, a critic of those historians who 


judge every scientist by the extent of his contribution toward the establishment of 
modern theories. Such an interpretation looks at the past in terms of present ideas 
and values, rather than trying to understand the complete context of problems and 
preconceptions with which the earlier scientist himself had to work. ([1974], p. 1169) 


In the face of such attacks, inductivist historiography has gradually been 
abandoned and efforts have by reaction been redirected towards the 
empathetic immersion of the historian in the complex of beliefs within 
which operated the scientists of interest. This approach, dubbed by Agassi 
‘conventionalism’, has been characterized by mistrust for present-day 
judgements of past science, viewed as a manifestation of biased hindsight. 
The watchword has therefore been passed for historians to abstain from 
appraisals of past truth and rationality. 

2. The issue of epistemic evaluation of scientific ideas has been side- 
stepped by the increasing injection into history of science of categories of 
psychology and sociology under the general name of ‘externalist history’. 
Rationalists believe that internal factors are capable of explaining most of 
the past developments of science, and that external factors are to be invoked 
only when the rationalist model falters. Rational change is by them assumed 
to be the norm: only deviations from it are to be explained by reference to 
perturbing influences. Laudan voiced this opinion clearly: 


When a thinker does what it is rational to do, we need inquire no further into the causes of 
his action; whereas, when he does what is in fact irrational—even if he believes it to be 
rational—we require some further explanation. ([1977], pp. 188-9; emphasis in 
original) 

By contrast adherents to the so-called strong programme in the sociology of 
science attack the very notion of the rational explanation of scientific change. 
Whereas the rationalist programme hinges on a differential assessment of 
belief, transmitting to the sociologist all and only those episodes regarded as 
unjustified, proponents of the strong programme like B. Barnes claim that 
all transitions should be attributed the same form of explanation whether or 
not we regard them as rational. On this view, to make our explanations of 
episodes of the history of science depend upon our present judgements 
about the form of rational decisions would be to project our beliefs 
illegitimately. Rather, Barnes writes, ‘our present theories should stand 
symmetrically with earlier scientific theories’ ([1977], p. 23). As a
consequence of this principle of symmetry, all transitions and not merely
so-called irrational ones are to be explained by reference to external factors,
including such diverse influences as the standards of education of a society, 
unconscious psychological motivations and the metaphysical commitments 
of an age. There is no criterion isolating putative episodes in science wholly 
susceptible of internal explanation. This amounts to a denial of the 
specificity of internal history and of the utility in this context of theory- 
assessment: there is no need to evaluate the rationality of past actions before 
deciding on what form of explanation to advance for them. The formulation 
of historical explanations occurs with rationalist eyes shut. 

3. Current philosophy has notoriously failed to provide a universally 
accepted account of scientific method. The contrary proliferation of models 
of rationality has undermined confidence among historians that normative 
evaluations of past science would not be irremediably tied to the fate of the 
pronouncements of individual philosophies which might rapidly lose favour 
among the community. Instantiations of this risk have been perceived in 
some accounts of historical episodes too heavily dependent upon idiosyn- 
cracies of polemic philosophy. Answering in the negative the query ‘Should 
philosophers be allowed to write history?’, L. P. Williams ([1975], p. 252) 
has charged philosophers with exploiting history to substantiate their own 
views on scientific method. Hypothetico-deductivism and the methodology 
of research programmes have appeared recently to spawn a particularly large 
number of partisan case-histories: J. T. Clark [1959], incisively criticized by 
even fellow-deductivist E. Nagel [1959], allows himself the free assumption 
of the postulates of the former to evaluate the actions of the protagonists of 
several past episodes, whilst C. Howson [1976] has assembled a number of 
histories incorporating the conceptual furniture of the latter. Anxious to 
compose accounts of longer life and wider appeal, most mainstream 
historians have shied from such close embrace of philosophical theses. 
Consequently the ideal has been envisaged of purely descriptive histories 
which refrain from evaluations grounded of necessity in philosophical 
doctrines. 

4. Hypothetico-deductivism, indicted above of inducing naive history, 
has simultaneously tended to restrict the scope of evaluations of past 
rationality by its sharp demarcation of the context of justification from that 
of discovery. According to this distinction, originally introduced by logical 
positivism, justification is an eminently public affair, governed by rules and 
thus transparent to critical scrutiny, whereas discovery is of only private 
concern, neither subject to precepts nor susceptible of rational reconstruc- 
tion. Since all paths leading to discovery are characterized by neopositivism 
as alogical, important parts of the historical record are withheld from 
epistemic evaluation. 

5. Assessment of past theories is considered downright impossible by 
philosophies of science placing emphasis on deep discontinuities of
rationality or ‘revolutions’. Such views have been advanced since the 1930s when
G. Bachelard wrote of ruptures épistémologiques ([1934], e.g. p. 176) and L. 
Fleck of the alternation of scientific Denkstile ([1935], esp. pp. 125-45), but 
their leading advocate today is of course Kuhn who explicitly invokes an 
analogy between scientific and political revolutions: in times of ‘normal 
science’ there is widespread agreement within the inquiring community on 
what constitutes solutions to the problems in hand, but in revolution 
scientists do not agree even on the principles that should govern the choice 
between paradigms. Theories embedded in rival paradigms simply cannot 
be compared since there are no theory-neutral principles relative to which 
this comparison could be carried out: 


The normal-scientific tradition that emerges from a scientific revolution is not only 
incompatible but often actually incommensurable with that which has gone before. 
(Kuhn [1962], p. 103) 


This is due to changes in both the meanings of the terms used by the 
scientists and the standards governing theory-preference: the former are 
claimed to preclude assessments of truth and the latter to impede evalu- 
ations of rationality. A typical example of meaning variance is constituted 
for Kuhn by Newtonian and Einsteinian mechanics, the terms of which are 
implicitly defined in each by reference to theory. There is thus no 
guaranteed stability of meaning for terms that figure in scientific theories: 
failure of translation occurs and gauging the truth or falsity of past
propositions is problematic. This worry has been further pursued by W. V. 
Quine ([1960], ch. 2) who argued that there can never be a unique translation 
of a sentence of one language into one of a second because the empirical 
evidence for one scheme of translation could always be reinterpreted in 
favour of another. Quine used this thesis, of the ‘indeterminacy of radical 
translation’, to cast doubt on whether there is a fact of the matter about what 
a sentence means, and hence on whether one sentence can ever ‘mean the 
same’ as another. Worse, even if we assumed invariance of meaning we 
would according to Kuhn find that in revolutions there are changes in the 
evaluative standards applied to theories. Kuhn speaks of paradigm-shifts 
bringing about ‘changes in the standards governing permissible problems, 
concepts and explanations’ ([1962], p. 106), citing as an example the late 
seventeenth-century transition from the visualization of gravity as sus- 
ceptible of mechanistic explanation to its conception as a primary attribute 
of mass and hence not further explicable. Moreover, even if we arbitrarily 
fixed standards of explanation that cut across revolutionary divides, we 
would find it no less impossible to choose between alternative paradigms: 
the measure of the achievement of a paradigm is according to Kuhn the 
number of problems it has solved, but this cannot be taken as a comparative 
standard since each paradigm will in general solve a different subset of the 
problems within its scope and it is even in principle impossible to adjudicate 
between different sets of solved questions weighted in accordance with 
differing criteria. These difficulties appear to render impossible the task of
identifying paradigm-neutral canons by which to evaluate the rationality of 
past science and, viewed in conjunction with meaning-drift hampering 
assessments of truth as outlined above, pose a severe threat to the entirety of 
our enterprise. 

To be sure, Kuhn has withdrawn from his earlier more extreme position 
to allow the possibility of partial communication between the proponents of 
alternative paradigms and has offered a list of good-making characteristics 
of theories for which he expects there will be consensus across paradigms. 
The list includes the requirements of accuracy, consistency, breadth of 
scope, simplicity and fruitfulness ([1977], pp. 321-2). Even so Kuhn is far 
indeed from the positions of the rationalist, for he suggests that in the 
competition among paradigms for the support of scientists these factors are 
intended to play not evidential but persuasive roles. The normative force of 
the requirements cannot be justified on any account of scientific method but 
derives rather from their common acceptance in the community of peers. So 
whilst a new paradigm may win the support of scientists by showing a 
greater adherence to the good-making requirements than was exhibited by 
their previous beliefs, this cannot be taken by the scientists as evidence that 
the new paradigm is in any sense objectively ‘better’. Lakatos has famously 
commented that Kuhn’s account renders theory-choice ‘a matter for mob 
psychology’ ([1970], p. 178). 

The consequences for theory-assessment in the historiography of science 
are radical: if there exist no supra-paradigmatic standards of evaluation, the 
historian possesses no objective yardstick against which to judge past 
science. The so plausible undertaking outlined in the previous section is 
thereby reputed impossible. 


3 COMPREHENSION AND CRITICISM 


Kuhn’s arguments for the occurrence of scientific revolutions constitute an 
obstacle in the path of evaluations of past science different from and greater 
than those posed by the other strands in modern philosophy considered in 
the previous section. For whilst historiographic conventionalism or ex- 
ternalism seek merely to demonstrate the undesirability of assessing the 
truth and rationality of past science, Kuhn maintains that such a programme 
is no less than impossible: the division of the course of science into periods of 
normality separated by putative discontinuities of meaning and rationality 
would render it impossible for us to project our judgement beyond the 
period and area of hegemony of our current paradigms. Thus the first task 
for those who would indulge in the assessment of past theories is to establish 
its very feasibility by undermining Kuhn’s scepticism concerning this 
activity. 

Clearly no historiographic activity would be possible given a total failure 
of translation from past texts as envisaged by Quine; the substantive 
challenge is therefore to secure evaluations of past science in the face of
partial failure. This has been attempted by M. B. Hesse ([1973], pp. 146-7) 
who aims to establish a sense in which the development of science stands 
outside historical relativity and is absolutely progressive. As she concedes, 
the relativist has demonstrated that conceptual constructs undergo deep 
revolutions and do not converge continuously towards the truth: for this 
reason the progress of science is not a mere increase in the number of true 
observation statements accounted for in successive theories, since the 
language of observation is permeated by theory. But there is a sense, Hesse 
claims, in which science provides us with increasing knowledge, and it is 
related to technological control. It is undeniable that the contemporary 
scientist is better able to predict and manipulate phenomena than were his 
predecessors: 


Because this sense of the progress of science is about controlled happenings, it is 
independent of the way facts are described relative to different theories. [. . .] It 
provides an absolute criterion of distinction of our rationality [. . .]. (Ibid., p. 147) 


Hesse suggests that the resultant continuity provides a basis upon which to 
ground evaluative judgements of past theories, even across putative 
paradigm-switches. 

This argument does not appear fully persuasive since the continuity 
which it envisages is merely instrumental: if the sole invariant feature of 
science were its predictive power, all evaluation of past science open to us 
would consist exclusively of a numerical tally of its empirical successes and 
not touch upon issues of the truth-value or rationality of theories, so missing 
the burden of our enterprise. Hesse here appears to have conceded excessive 
ground to her opponents. Conversely, relativists like P. Feyerabend would 
not admit Hesse’s premise that there has in history been a monotonic 
increase in the predictive power of science, stressing on the contrary that 
past theories scored some successes not replicated by current science, and 
that prediction is in any case not the sole aim of theories. A different base 
upon which to ground evaluations of the rationality and truth of past science 
would in the face of these Kuhnian rebuttals appear desirable: one will be 
outlined now. 

First, the sole way in which to identify a scientific revolution is surely by 
noticing that our estimation of the canons of rationality employed at around 
its time changes at the boundary that the putative revolution constitutes. 
Inferences peculiar to past paradigms will diverge from those that we under 
the same premises would find legitimate to draw, and given a pair of past 
paradigms the characteristic inferential patterns that we perceive in each 
will diverge from ours in different ways. If they did not, and hence there 
were no visible discontinuity of reason in the period examined, it would 
make no sense to postulate the occurrence of a revolution. So the trans- 
paradigmatic evaluation of rationality is no less than a precondition for the 
formulation of the concept of radical standard variance if the latter is to refer 
to identifiable historical junctures. It will no doubt happily be noted by
Kuhnians that evaluations performed for this purpose confer no privilege on 
our own canon of rationality which is therein employed solely as a rational 
null-indicator to reveal discontinuities that, as Kuhn too claims, would 
show on any base adopted. 

Now to evaluations of truth, to which the challenge posed by radical 
meaning variance centres on the alleged impossibility of comprehending the 
more theory-laden terms of past science. The situation here confronting the 
historian of science is analogous to that which faces a physician attempting 
to establish the history of a disease from the answers provided by a medically 
untutored patient. The doctor meets difficulty in comprehending his patient 
for two reasons: first, some of the latter’s terms and expressions may be 
ambiguous or possess meanings different from those which the doctor 
would attribute to them in his own technical vocabulary; secondly, the 
patient’s conception of the processes of disease themselves may depart to an 
unspecified degree from the doctor’s understanding of the truth so it is 
uncertain what the patient is even attempting to describe. A Kuhnian 
practitioner would thus expect no chance of evaluating the accuracy of his 
patient’s assertions and thus of conferring upon them truth-values. The 
historian is said to operate under a similar double handicap: he cannot fully 
understand the scientific text before him which is in any case striving to 
describe a situation that strays from the real to an indeterminate degree. 
Under these conditions, proponents of the incommensurability thesis claim 
that any attribution of truth-value to an outdated theory will be wholly 
arbitrary since it will rely upon a construal of meanings designed precisely to 
lead to the verdict desired by the investigator. This charge of circularity will 
be seen below in reality not to stand. What has given rise to this charge is a 
methodological principle which has suspect legitimacy but is nonetheless at 
present generally invoked against meaning variance by those who would 
carry out evaluations of truth: the principle of charity, that for instance we 
should assign to past terms those referents which render true the greatest 
number possible of the propositions in which those terms figure, or that the 
assigned extensions of past predicates should show the widest possible 
overlap with the extensions of the corresponding predicates in our— 
believed true—theories. (This principle is quite separate from the so-called 
‘principle of humanity’ which calls for the assignation of referents to past 
terms in such a way as to optimize the intelligibility of the statements in 
which they figure.) The principle of charity was first proposed over twenty- 
five years ago by N. L. Wilson ([1959], p. 532) and is nowadays often 
regarded as an a priori methodological rule; D. Davidson endorses this claim 
when he writes that 


charity is not an option, but a condition of having a workable theory [. . .]. Charity is 
forced on us; whether we like it or not, if we want to understand others, we must 
count them right in most matters. ([1973], p. 19) 
This procedure of minimizing disagreement is especially visible in the case
of terms which are no longer reputed to refer. For example Joseph Priestley 
believed in the phlogiston-theory because of observations which he 
described in theory-laden terms such as 


When iron is melted in dephlogisticated air, we may suppose that, though part of its 
phlogiston escapes, to enter into the composition of the small quantity of fixed air 
which is then procured, yet enough remains to form water with the addition of 
dephlogisticated air which it has imbibed, so that this calx of iron consists of the 
intimate union of the pure earth of iron and of water [. . .]. ([1785], pp. 299-300) 


The term ‘phlogiston’ is now known not to refer and so cannot be learned 
ostensively: other means are required in order to recover from Priestley’s 
descriptions what this word was intended to denote. By routine charitable 
construal of the terms ‘iron’, ‘water’ and the like as possessing their modern 
meanings, historians have ascribed to Priestley the greatest possible truth- 
content by identifying ‘dephlogisticated air’ with ‘oxygen’. The business of 
assessing in exactly how much truth this maximum consists is thereafter 
unproblematic since an unambiguous glossary of the terms of phlogiston- 
chemistry has been constructed. But the major weakness of this approach 
lies upstream of this point and precisely in the assumption of the principle of 
charity and that hence the greatest possible truth-content should invariably 
be read into past texts. Why, when we know how easy it is to assert a gravely 
incorrect theory? The physician of our example may well under the 
principle of humanity strive to optimize his comprehension of the patient by 
assigning an appropriate interpretation to the latter’s statements, but will 
generally not choose to construe as many as possible of them as factually 
correct, particularly if there are doubts about the sick person’s grasp of 
medical processes: rather will the doctor judge in each case the degree of 
credence to be lent to the patient’s answers by reference to the apparent 
rationality of the reasoning from which they derived. This hermeneutical 
procedure should equally be followed in the historiography of science. The 
principle of charity amounts therein to a withdrawal of autonomy from 
assessments of truth, which are considered to follow mechanically from a 
maximization of the truth-content of past scientific writings. But plainly a 
deliberate decision should be made of the degree of charity to be applied in 
each case: and such decisions are taken by our prior assessments of 
rationality. The optimally rational investigator is the one who of all the 
competitors under equal conditions reaches the most truth; any decrease in 
his standard of rationality would tend to introduce increasing falsehood into 
his reports. So it is wrong to present—as Davidson does—the principle of 
charity as an a priori assumption not admitting of exceptions: our 
expectations of truth must on the contrary be calibrated against our 
estimations of the rationality of our subjects of study. (In turn this latter 
activity is invariably possible as was observed in the previous paragraph.) In 
the case of the phlogiston-theory we must thus evaluate Priestley’s 
rationality in order to gauge just how closely his descriptions will have
matched the phenomena before him. Believing him scientifically competent 
we may construe his descriptions favourably, but if he had been a less 
discriminating theorist the intended meanings of his terms might have been 
different from those we would most charitably assume and his arguments 
for the existence of ‘phlogiston’—whatever this term denoted— 
consequently been less secure. So although our esteem of major named 
figures in the historical record may be invariably high, it should be viewed as 
the outcome of no less a deliberate decision for that—a decision to be argued 
and reached on the basis of evaluations of rationality. 

Once reached, the denotations of the terms contained in past theories and 
the truth-values of the assertions in which they figure may be determined in 
the same manner as under an acritical principle of charity. But in the light of 
the above argument those denotations and truth-values are seen to be 
established on the basis of evaluations of rationality rather than on an 
arbitrary methodological principle. Such evaluations are thus rooted in the 
initial search for discontinuities of reason or ‘revolutions’ which was 
considered above. True, whilst in that context evaluation conferred no 
privilege on our beliefs, in this latter connection we openly judge the 
standard of rationality of past scientists by our lights alone. But this is no 
defect of the above argument, for two reasons. First, its aim was to counter 
Kuhn’s contention that evaluations of past truth were—owing to radical 
meaning variance—impossible, and to show that on the contrary such 
assessments are potentially well-founded: and this it has done. A denial of 
the incidence of historiographic bias never was among its self-imposed 
tasks. Secondly, we have been conducting judgements of rationality for 
purposes of historiography, and a written history is inevitably tainted by the 
environment of its composition on many grounds other than these, which 
would thus in any case be swamped. 

To summarize: the above arguments have attempted to show that despite 
Kuhn’s assertion of incommensurability it is feasible to perform evaluations
of truth in that it is possible to comprehend past scientific terms by assigning 
to them denotations. In turn this is accomplished with the aid of assessments 
of rationality, which we have noted are invariably possible—albeit biased—
and indeed essential to the secure foundation of Kuhn’s thesis itself. Neither 
as a consequence do evaluations of truth suffer from circularity: they hang 
not from an arbitrary principle of charity manufactured solely to ensure a 
certain verdict in those same evaluations, but rather from assessments of a 
different species—of rationality. Evaluations thus lie not in a circle but in a 
progression in which both the rationality and the truth of past science may 
securely be gauged by the historian. 

This conclusion is buttressed by the realization that such evaluations have 
concretely been performed in history by scientists themselves in the many 
incidents in which scientific theories were at first rejected or neglected and 
much later revived and accepted as correct. These are cases in which the 
scientific community has overturned its assessment of a theory after a
sometimes considerable lapse, pragmatically demonstrating the possibility 
of projecting such judgements over time. For instance, Christiaan Huygens 
published in 1690 a wave-theory of light capable of explaining many optical 
phenomena, including—alone among the available theories—double refrac- 
tion in calcite. This theory was nonetheless rejected for around a century in 
favour of the Newtonian approach with which it was inconsistent and 
revived only in 1801 by Thomas Young. Again, Benjamin Thompson’s 1798
vibration-theory of heat was perceived as superior to the competing caloric
theory only in the 1840s by James Joule. Other examples of theories
upon which the judgement of the academic community was changed from 
unfavourable to positive after a sometimes lengthy interval include I. 
Semmelweis’ theory of the cause of puerperal fever of 1847, M. Polanyi’s 
1914 theory of adsorption and N. Bohr’s 1927 Complementarity Principle.¹
What matters in such cases is that the scientific community was able to pass 
evaluations of truth and rationality upon past theories; the test of the well- 
foundedness of such evaluations is that in all such cases science was able to 
build upon these past achievements thus revalued and incorporate them into 
current beliefs. They consequently provide prima facie instances of the 
assessment of theories originally formulated in a past time, which counter 
Kuhn’s conclusions to the extent required to establish the possibility of 
theory-evaluation in historiography. 


4 THE ROLES OF EVALUATION 


The possibility of the evaluation of past theories assured, it remains to 
suggest its desirability by countering the strands in recent philosophy 
outlined in section 2 above, which have tended to relativize the historian’s 
criteria of rationality. The issue with which those strands were grappling 
was that of determining what should count as reasons for beliefs in the 
historiography of science. In his influential Foundations of Historical 
Knowledge M. White suggests that when a historian inquires why named 
historical figures held certain beliefs, he may give a rational explanation in 
terms of ‘the reasons stated by the thinkers or half-stated by them, or the 
reasons they would have stated if they had been asked certain questions’.²
This characterization however leaves ambiguous the status which reasons 
must possess if they are legitimately to be employed in historiography. It 
would appear difficult to maintain that reasons are atemporal motivations to 
which philosophers as such are inevitably subject, for past construals of 
certain arguments appear notoriously distant from those of present-day 


1 See e.g. on Huygens Laudan [1977], pp. 232-3, on Thompson ibid., p. 83, on Semmelweis
Sinclair [1909], on Polanyi his own [1963] and on Bohr Holton [1970]. 

2 White [1965], p. 196. White’s views are further analyzed in Hesse cit., pp. 134-9, on which my 
discussion draws. 
thinkers. The Whiggish shortcomings of this perspective would be avoided
if we were to give rational explanations of a past belief in terms of what 
would have been seen as reasons at the time, but this option too meets 
immediate difficulties. First, it sets historians the task of discovering what 
reasons truly governed the scientist’s beliefs; this is sometimes difficult since 
such reasons are not always expressed, particularly once the theories 
concerned have become entrenched, and since even when scientists do lay 
out their principles explicitly they may be misdescribing their procedures 
out of lack of awareness or conformity to prevailing intellectual fashion: 
certain past scientists notoriously departed from their stated methodological 
tenets in practice. Secondly, no one form of rationality has commanded 
universal assent in the history of thought: some once fashionable modes of 
argument have long been discredited and so possess no logical explanatory 
force on today’s standards. It is impossible then to advocate using past 
arguments to explain past beliefs when faith in the former is to us no less 
arbitrary and mysterious than credence in the latter. In such cases the 
arguments employed by past scientists do not today constitute the ex- 
planantes of their beliefs but rather must number alongside those same 
beliefs as the historian’s explananda. Copernicus reportedly inferred that 
the Sun is at the centre of the planetary system from its analogy with a 
monarch at the centre of his court: we are not convinced by this argument 
and indeed find it more alien than the conclusion. Copernicus’ belief in the 
latter is thus not to us explained by his faith in the former at least until we 
have found reasons acceptable to us for the sixteenth-century confidence in 
analogies between the Solar System and a royal court. In the Renaissance a 
wide diversity of such accepted patterns of argument transpires, and their 
historian is then far from being able to presuppose at the outset an adherence 
to a certain mode of rationality among his subjects of study: on the contrary 
his task is precisely to determine which forms of inference were then trusted. 
And although epochs since the Renaissance may appear to exhibit less 
divergent or rapidly changing conceptions of rationality, the difficulties of 
principle cast much doubt on the possibility of explaining past beliefs in 
terms of what would have been seen as reasons at the time. 

The option thereby suggested is that of accounting for past beliefs in 
terms of just what strike us as reasons and only later turning to the separate 
problem of establishing how far the justifications—suitably reinterpreted— 
actually once given for those beliefs match those that we have attributed 
to them. This procedure amounts quite simply to the evaluation of past 
thought, the enterprise discussed above: for we are implicitly allowing the 
possibility that whilst certain past beliefs will be vindicated by appeal to 
present reason, others may be found to have been unwarranted. The degree 
of justification that we attribute to beliefs in history constitutes precisely an 
assessment of the rationality of their proponents. Two positive arguments in 
favour of accounting for past beliefs in terms of present-day reasons will be 
outlined here: the claims that evaluative judgements yield additional 
historical information not easily obtained by other means, and that they are a
precondition for the identifying and explaining of scientific progress. 

The most convincing support for the first argument is provided by 
exemplification. One of the main problems facing Galileo in the Discorsi of 
1638 was that of motion in a resisting medium: in the Fourth Day he 
attempted to show that air resistance was independent of velocity by 
impressing upon one of two identical pendulums oscillations of around 160° 
whilst allowing the other to swing through only 10°, noting their behaviour 
throughout. Galileo tells us that 


if the air offers more resistance to the high speed than to the low, the frequency of 
vibration in the large arcs [. . .] ought to be less than in the smaller arcs [. . .], but this 
prediction is not verified by the experiment; because if two persons start to count the 
vibrations, the one large, the other small, they will not differ by a single vibration, not 
even a fraction of one. ([1638], p. 244) 


Yet in 1976 the historian R. H. Naylor correctly quoted for this experiment a
very different result: 


Using two similar lead pendulums 100 inches long, I allowed one to swing initially 
through 120° and the other through 10°. I found that the pendulums were a quarter of 
an oscillation out of step after eight or nine vibrations. Thus Galileo’s description of 
this particular experiment does not agree with the actual case at all. ([1976], pp. 
400-1) 


It is upon the blatantly evaluative last sentence of this passage and others 
that commentators have pinned their fruitful construal of many Galilean 
experiments as didactic thought-demonstrations manufactured for their 
persuasive power after the completion of the relevant theory rather than as 
true sources for Galileo of raw data. Our replication of this historical 
mechanism has led us to opinions of our own on the text’s veracity and 
thence to a new interpretation of the role of experimental evidence in 
Galileo. An evaluation of past science has here visibly provided the basis for 
much of the current literature on a major figure of the scientific revolution. 

Similarly, Aristotle appears generally to have been a very accurate 
biological observer and dissector;¹ when his anatomies diverge too grossly
from the organisms as we know them from our observations, the historian is 
thereby led to suspect that Aristotle is not relating his own findings but may 
be drawing upon metaphysical beliefs or reporting the results of others: 
these suggestions are historically revealing. For instance, we know that 
certain of the corporeal differences which Aristotle claimed to have noted 
between men and women do not exist; that Aristotle contrary to observ- 
ational evidence believed they did may inform us of social attitudes 
prevalent at his time, provided we are prepared to exploit our own 
evaluations of the truth of Aristotelian writings. Again, explicit epi- 
stemologies in the early nineteenth century in Britain were mainly 


1 This example and the next were suggested to me by Hull [1979], p. 13 and p. 10 respectively. 


330 James W. McAllister 


inductivist; on this basis many alleged that the Darwinian theory of 
evolution was unscientific because the product of hypothesis. The historian 
is able to go far beyond a mere description of this controversy by noting that 
Darwin’s methods e.g. in his Notebooks on Transmutation admittedly 
departed from the inductivist precepts of John Stuart Mill but no more than 
did the procedures of other nineteenth-century scientists: this evaluation 
reveals the fact that the methods suggested by Mill and others were not 
embodied in current science and hence that their advocates’ criticism of 
Darwin was not biting. This new information is owed precisely to evaluative 
judgements. Lastly, the most salient problem recently to emerge from the 
study of Oriental science is that of explaining why, although China until 
about 1400 was scientifically and technologically more advanced than 
Europe, the scientific culture developed by the West in the succeeding three 
centuries was signally unmatched there. The explanation advanced by J. 
Needham refers to the centralized control of technological development by 
the mandarinate and to the Chinese bourgeoisie’s impotence to bridge the 
gap between intellectual and manual labour; whatever the correct explana- 
tion, the possibility of even posing this fruitful question depends upon the 
belief that past states of science can be normatively compared. 

The above examples have illustrated the range of historiographic 
problems and explanations access to which is unobtainable without 
assessment of past science and have clearly demonstrated the prize that 
would be lost if such evaluations were banned. But perhaps the more 
powerful argument in favour of the assessment of past theories is that they 
are necessary for the recognition and explanation of scientific progress.¹
Few thinkers in current philosophy doubt the manifestation of some form of 
progress in science: K. R. Popper, Lakatos and Laudan may differ in their 
conception of progress and their theories about its features, but do not 
disagree about its occurrence; even Kuhn does not question this. Progress 
also forms the basis for debates on realism, a common argument for which is 
that without the realist assumption that there is some convergence of 
scientific theories towards the truth, the success of science in correctly 
predicting observable phenomena would be a miracle. Thus any test of this 
assumption involves the detection of progress, and in turn the recognition 
and explanation of progress requires the differential assessment of beliefs. 

First, in referring to progress we are characterizing the beliefs of later 
scientists as containing more truth than did those of their predecessors. We 
are not merely reporting their own perception of their achievement but also 
concurring in this judgement. So we require our own evaluations of past 
efforts in order to recognize the occurrence of progress. Its explanation too 
possesses this requirement: theory-change can be explained by reconstruc- 
tions of lines of reasoning without assessing beliefs, but if an explanation is 
desired of why this transition constituted progress, reference must be made 


1 This argument is presented in Newton-Smith [1981], ch. 10, on which I have drawn. 
to the truth-content of the theories involved. If we wish to explain past
transitions we can do so by showing that the participants were acting 
reasonably by their own standards: such explanations do not require us to 
endorse their beliefs. However, if we want to explain why there was progress 
and not merely change we shall need to attribute truth-values to their 
beliefs: the latter activity plainly involves evaluation whereas the former 
does not. Lastly in the explanation of progress the historian will wish to 
eliminate the possibility that science has so far been the product of lucky 
accidents by showing that the reasons of past eras (if necessary interpreted 
into modern terms) partly coincide with or approximate to our own, and 
perhaps show an ordered evolution in scientists’ conception of reasons for 
beliefs. Nonetheless progress cannot to us be explained by past canons of 
rationality except precisely in so far as the latter may be identified with our 
own. 

To illustrate the difference between explanation of theory-change alone
and of progress we may take the Principia Mathematica, of which Newton 
claims that the method is inductive, proceeding by generalizations from 
singular observations. He presents the axiomatic section which opens the 
Principia as resting on inductive evidence of precisely this kind: the 
implication is that each of his three laws of motion is based on an inductive 
generalization from observed instances of the operation of forces and 
masses. The empiricism and anti-hypotheticalism of this construal was a 
major attraction of the Principia in England and so can explain the belief- 
transitions of many who embraced Newton’s mechanics; but it does not, 
particularly in the light of debates on induction from Hume to the present 
day, explain why Newtonianism was more successful than and hence 
constituted progress over its predecessors. To explain this, the historian lays 
aside the author’s protestations and notes that the three laws are not 
inductive but rather conjectural, depending in a crucial way upon the 
introduction of new and complex explanatory terms, ‘force’ and ‘mass’, 
which are defined within the axiomatic structure and to which classical 
mechanics owes its predictive success. Now there is no suggestion that it was 
this hypothetico-deductive structure that attracted empiricists to Newton: 
they would more likely have objected to the Principia if they had seen it thus. 
But what this construal does purport to explain is why Newton’s mechanics 
was successful as science and why it has been a progressive contribution to 
physics. And to establish this construal we have had to resort to our 
evaluation of Newton’s methodological claims and to our own judgement, 
guided by normative hindsight, of the structure of the Principia. 

Therefore to identify and explain progress and thereby inter alia to permit 
investigation of convergent realist arguments, it is necessary to develop a 
normative model of theory-appraisal and to show that the community of 
scientists have made decisions with results that approximated to it in some 
degree. Recognition of this point and of the impact of the previously 
outlined arguments suggests that the evaluation of the truth and rationality 
of past science not only is an essential component of the written history of
science but also has a positive role for epistemological speculation. In 
particular the debate which provided the original motivation for this 
enterprise—the discussion between Lakatos, Laudan and others on the 
evidential role of history for philosophy—has been given a sound base: since 
assessments of past theories are in principle well-founded, the evaluated 
rationality of historical episodes may after all be brought sensibly to bear as 
evidence to adjudicate between competing methodologies. 

If these claims are accepted, not just the possibility but also the 
desirability of such evaluations has been established. In this light it would be 
surely counterproductive to propose the eradication from historiography of 
the exercise of normative judgement. The critical faculties that we so 
enjoyably direct against our contemporaries are profitably trained onto our 
scientific predecessors also. 


Jesus College, Cambridge 
Discussions


ON THE DETERMINATION OF PLANETARY DISTANCES 
IN THE COPERNICAN SYSTEM 


In past issues of this Journal, some articles have been published concerning 
the determination of planetary distances in the Copernican system. 

The first article was by Alan Chalmers [1981]. His thesis is that the 
possibility of determining the planetary distances in the Copernican system 
(but not in the Ptolemaic) is a consequence of the purely accidental fact that 
humans inhabit the Earth. If humans inhabited the Sun, then it would be 
the Ptolemaic system (and not the Copernican one) which would allow the 
calculation of planetary distances. 

As Martin Curd and Keith Hutchison showed, Chalmers’ claim is strictly 
false. We may incidentally note that, in replying to Curd and to Hutchison,
Chalmers wrote that ‘there are possible arrangements of the planetary 
system for which my claim would be correct’ (Chalmers [1981], p. 372). We 
think that this conclusion has no interest at all for the discussion, given that 
it is very similar to a strict tautology. Therefore, there would be no need to 
reopen the discussion if Hutchison’s note did not contain, in our opinion, a 
serious mistake. This mistake has been accepted by Chalmers in his reply. 

According to Hutchison, there are two different ways of determining the 
planetary distances in the Copernican system. The first is the well-known 
method of triangulations. But ‘a Copernican astronomer can evaluate the 
relative distances of the planets from the center of the Solar System, directly 
from the tracks of the planets through the Zodiac [. . .] without reference to the
triangulations’ (Hutchison [1983], p. 370. The italics are added). Hutchison 
himself refers to Otto Neugebauer’s description of this second method 
(Neugebauer [1975], p. 146). 

In the left side of Figure 1, let OS be the mean heliocentric distance
of the Earth in the Copernican system. Let R and r be, respectively,
the radii of the deferent and of the epicycle of a planet P (inferior or superior)
in the Ptolemaic system. If one converts the heliocentric circular (or elliptic)
motion to a geocentric one, and if one takes OS as the unit, then the
heliocentric distance a of the planet is, for an inferior planet, a = r/R; for a
superior one, a = R/r. Of course, one can invert the procedure, and hence
obtain the heliocentric distances.
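
The conversion just described can be illustrated numerically. What follows is only a minimal sketch: the function names `heliocentric_distance` and `ptolemaic_ratio` and the sample radii are assumptions introduced for this illustration and are not taken from Hutchison, Neugebauer or any historical source. The Earth's mean distance OS is the unit, and eccentricity is ignored.

```python
def heliocentric_distance(R: float, r: float, inferior: bool) -> float:
    """Return the planet's distance a from the centre, with OS = 1.

    R is the deferent radius and r the epicycle radius of the planet's
    Ptolemaic model; only their ratio matters.
    """
    if R <= 0 or r <= 0:
        raise ValueError("radii must be positive")
    # Inferior planet: a = r/R.  Superior planet: a = R/r.
    return r / R if inferior else R / r


def ptolemaic_ratio(a: float, inferior: bool) -> float:
    """Invert the conversion: recover the ratio r/R from the distance a."""
    if a <= 0:
        raise ValueError("distance must be positive")
    return a if inferior else 1.0 / a


# Hypothetical sample radii, chosen only to make the arithmetic easy.
inner = heliocentric_distance(R=60.0, r=45.0, inferior=True)
outer = heliocentric_distance(R=60.0, r=40.0, inferior=False)
```

The sketch makes visible that the conversion accepts any positive pair R, r: nothing in the arithmetic checks the ratios against observation.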

According to Hutchison, a Copernican astronomer can compare the 
results of these two methods with each other (e.g., he ‘can predict the results
of solar triangulations before they are carried out’). As a consequence, ‘he 
runs a real risk of having his theory falsified. Since his predictions are in fact 
confirmed, his theory attracts the “credit” of having performed a successful
novel prediction. It is, then, not simply the fact that the planetary distances 
Figure 1.


can be determined which distinguishes for the Copernican theory, it is the 
fact that they can be doubly determined’ (ibidem). 

In his reply, Chalmers goes far beyond Hutchison. In his opinion, 
Copernicus himself ‘does not appear to have appreciated the possibility of 
applying both types of method to all the planets and thereby subjecting his 
system to a severe empirical test’ (Chalmers [1983], p. 373). 

However, in our opinion it is possible to show that Copernicus was right, 
and that both Hutchison and Chalmers are wrong. Let us start from 
Hutchison’s claim that a Copernican astronomer can calculate the planetary 
distances ‘directly from the tracks of the planets through the Zodiac, etc.’. 
As can be seen from the figure above, this claim is not correct. The
distances are merely drawn from the Ptolemaic system with fixed values of 
all the R’s and r’s, via the assumed geometrical equivalence between the two 
systems. Note that this conversion is possible for any value of R,r, 
irrespective of the observational results they produce. 

But why can this conversion produce the planetary distances in the 
Copernican system? This happens only because all the angular values 
needed for triangulations are implicit in the accepted R/r, r/R ratios. One 
just has to think of Venus’ maximum elongation, which follows from a given 
ratio r/R.¹ We have to stress that the genuine astronomical validity of this
‘method’ rests completely on the supposition that the accepted ratios have
observational consequences (i.e., lines of sight) which are true; at least, that


1 Of course, we do not consider here the problem of the eccentricity of the orbits. 
those observational consequences are true that are needed to fix the
planetary distances in the Copernican system. 
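
The point about Venus can be made concrete with a small sketch. It assumes circular, concentric orbits (eccentricity is ignored, as in the footnote), so that the maximum elongation E of an inferior planet satisfies sin E = r/R. The function names and the sample ratio are hypothetical and serve only as an illustration.

```python
import math


def max_elongation_deg(r_over_R: float) -> float:
    """Maximum elongation implied by a Ptolemaic ratio r/R (inferior planet)."""
    if not 0.0 < r_over_R < 1.0:
        raise ValueError("r/R must lie strictly between 0 and 1")
    return math.degrees(math.asin(r_over_R))


def distance_by_triangulation(elongation_deg: float) -> float:
    """Heliocentric distance (OS = 1) triangulated from a maximum elongation."""
    return math.sin(math.radians(elongation_deg))


def distance_by_conversion(r_over_R: float) -> float:
    """Heliocentric distance (OS = 1) obtained by converting the ratio r/R."""
    return r_over_R


ratio = 0.72  # hypothetical value, not a historical parameter
implied_angle = max_elongation_deg(ratio)
by_triangulation = distance_by_triangulation(implied_angle)
by_conversion = distance_by_conversion(ratio)
```

When the angle fed to the triangulation is the one already implied by the ratio, the two results coincide by construction; they can differ only if the triangulation is given an angle that is not congruent with the accepted ratio.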

What would happen if Hutchison’s and Chalmers’ comparison of the
two ‘methods’ were made? Let us consider the two possible cases separately. 

First, let us suppose that there is no agreement between the results. Ceteris 
paribus, this disagreement can have two ‘causes’: 1: The two sets of input 
data furnished to the Copernican model are not congruent (in the 
mathematical sense); 2: The Copernican model is not self-consistent.¹ (It is
obvious that the two possibilities are not mutually exclusive.) As one can 
easily understand, none of these possibilities can be considered as an 
empirical test of the Copernican system. Therefore, they do not give the 
logical condition for falsifying it. It is very important to stress that this holds 
irrespectively of the ‘direction’ in which the comparison is done; as a matter 
of fact, there are no a priori reasons for attributing to one ‘method’ a greater 
reliability.² As a consequence, Hutchison’s idea of testing the results obtained
by conversion against the results obtained by triangulation is unjustified. 
This test can merely show (if we assume the consistency of the Copernican 
model) that some of the observational consequences of the Ptolemaic 
parameters are not congruent with the input data of the triangulations: 
nothing more. 

If we are right, then the second case—that is to say, there exists an 
agreement between the obtained results—is quickly settled. Does it mean 
that the Copernican ‘theory’ has given a prediction, or, still more, a
‘successful novel prediction’? The answer is obviously no. It is a matter
neither of a novel prediction, nor of a prediction at all. Under the hypothesis
that the two sets of input data are congruent, the agreement of the output 
data merely gives a proof that, in this respect, the Copernican model is 
consistent. 


1 It is not easy to see in what sense the Copernican system could be said to be inconsistent. As a
matter of fact, the ‘axioms’ of the Copernican model can be reduced to the statement that all 
the planets move on circles approximately centred on the Sun. Planets’ periods are not related 
to the model, and the metrical (and topological) features of the system are a consequence of 
the model itself plus some sets of empirical data. (As far as these features are concerned, one 
has to add also the condition that the observer is not based on the Sun.) Therefore, the 
eventual inconsistency of the Copernican system could arise either from the geometry which 
has to be logically connected to the model, or from a contradiction between the ‘axiom’ and 
the geometry. There is no need to stress that these are just logical possibilities, without any 
historical or theoretical relevance. 

2 In some respects this is not true. As a matter of fact, the reliability of a value for a given planet
obtained from a single observation (or from a determined set of observations) will be, ceteris 
paribus, greater than the corresponding one obtained via its model. One has just to consider 
that the model of a planet is not conceived for maximizing the precision of a single output (e.g.,
Venus’ maximum elongations). It is conceived (or at least, it was till Newton and also later) in 
order to produce the most accurate planetary tables. But this means that an astronomer will 
have to maximize some kind of ‘average accuracy’. Of course, if the model were true, then 
there would be no difference between the two values. But this is an irrelevant possibility, as far 
as our problem is concerned. It is perhaps interesting to note that, in determining Venus’ 
distance, Copernicus seems to rest on Ptolemy’s observational values for its maximum 
elongations, but not on the r/R ratio (Book 5, chapters XX and XXI). 
One might be tempted to say that, given the truth of the lines of sight
generated by a Ptolemaic system with some given parameters, this very 
truth has to be transmitted to the distances obtained by converting the 
Ptolemaic to a heliocentric system (the Copernican or the Tychonic one). 
However, this does not happen. For this purpose, it is sufficient to consider 
that there exists a whole class of possible planetary arrangements which 
share the same lines of sight (or which generate true lines of sight) but which 
do not produce the same planetary distances. 

In order to make this point more palatable, we can produce an 
arrangement which not only satisfies the condition stated above (i.e., it 
generates the same lines of sight), but moreover it is geometrically equivalent¹
to the Ptolemaic one. 

As far as our ‘pseudo-Copernican’ system is concerned (left side of Figure 2),
one can operate the very same conversions that hold between the
Ptolemaic and the Copernican systems (see Figure 1); that is to say, one can 
convert all the Ptolemaic R/r, r/R ratios to the distances between the 
system’s centre and each planet as a ratio of the Earth-system’s centre 
distance. Furthermore, as one can easily see, in this system all the planetary 
distances can be obtained by triangulations (in the very same ways as they 
can in the Copernican one). Ceteris paribus, the two ‘methods’ will produce 
the same values. Therefore, this ‘pseudo-Copernican’ system will success- 
fully pass Hutchison’s test, exactly as Copernicus’ system will do. But, of 
course, the two systems are not the same system. (The fact that our system is 
not the true one is even superfluous in order to demonstrate our thesis.) 


1 The Ptolemaic system has an additional degree of freedom with respect to the Copernican 
one, namely, the OC/OS distance. By giving a determinate value to this free parameter we can 
obtain a particular Ptolemaic system which is geometrically equivalent to the Copernican one. 
Any other value of this free parameter defines a different arrangement of the Ptolemaic 
system, from which one could obtain a geometrically equivalent system by a change in 
coordinates only. Therefore, that one shown in Figure 2 is just one out of the class of the 
possible ‘pseudo-Copernican’ systems. 

Incidentally, one could remark that Rosenkrantz’s thesis that the Ptolemaic system has two
additional degrees of freedom by comparison to the Copernican one, is false (Rosenkrantz
[1977], p. 139). According to Rosenkrantz, the second degree is represented by the fact that a
Ptolemaic astronomer could freely arrange the size of the planetary orbs. But this is not an 
additional degree. Once the first parameter is fixed, then the planetary distances are fully 
determined. For example, we may note that by assuming OC = OS, we obtain the Tychonic 
system, in which the planetary distances are the very same as in the Copernican one. 

As one can easily see, in ‘our’ system the centre is no longer represented by a heavenly body, 
but by a geometrical point, to which all the distances have to be referred. Of course, this has no 
influence at all as far as our theses are concerned. We may perhaps remind the reader that in
Copernicus’ system the centre of planetary motions is also represented by an empty 
geometrical point, namely, the centre of the Great Orb. However, the ‘pseudo-Copernican’ 
system here shown is not the same as Copernicus’ own system. In this regard, it is sufficient
to consider that the relative positions of O, C, and S, are different in the two models. While in 
‘our’ system C is ever on a fixed point of the given line SO (or, in a different version of the 
model, S is ever on a fixed point of the given line CO), in Copernicus’ system the relative 
positions of O, C, and S, do not have to satisfy this condition. From a purely geometrical point 
of view, the ‘pseudo-Copernican’ system is a particular case of the De Revolutionibus system. 
(See also De Revolutionibus, Book 3, chapter XX.) 
Figure 2.
On the basis of these simple considerations,¹ one cannot but conclude that
Hutchison’s proposed empirical test for the Copernican system, as well as 
Chalmers’ ‘charge’ against Copernicus, are without any foundation. The 
claimed double determinability of the planetary distances in the Copernican 
system is not a genuine one. It is just a double way of doing the very same 
calculation. This is a confirmation of the well-known fact that, in 
Copernicus’ time, there was only one method for obtaining the (relative) 
planetary distances. Of course, if one accepts the thesis that a theory (or one 
of its consequences) has no empirical value unless there exists an in- 
dependent empirical test of it, then the determination of planetary distances 
does not belong to the empirical aspect of Copernicus’ system. It constitutes 
only a part of its systematic superiority over the Ptolemaic system.

As the final point, we have to stress that even from the historical point of 
view Chalmers’ claim according to which ‘Copernicus used [. . .] the method 
involving the radii of “Ptolemaic” epicycles and deferents for the superior 
planet, Mars’ (ibidem) is very doubtful. The determination of Mars’ 
distances is given in Book 5, chapter XIX, of the De Revolutionibus. As one 
can easily see, the ‘method’ here used has nothing to do with the conversion 
from the Ptolemaic ratio R/r. Mars’ distances (maximum, minimum, and 
mean) are drawn from Copernicus’ own model of the planet by adding to it 
one determinate value for the anomaly of the eccentric. (This value is 
reported as obtained by Copernicus’ observation of Mars in the kalends of 
January 1512.) Of course, from the logico-geometrical point of view, ceteris 
paribus this procedure does not differ from the triangulations. 

The ‘moral’ we should perhaps draw from all this matter is that one 
should be more careful before promulgating the idea that a giant like 
Copernicus (as well as his opponents and his defenders) merely failed to 


1 Of course, our arguments against Hutchison’s and Chalmers’ theses are completely 
independent of the example of a ‘pseudo-Copernican’ model. 
appreciate the possibility of performing so easy a test of the heliocentric
system.¹


ANGELO M. PETRONI and LUCIO SCOLAMIERO 
Centro di Ricerca e Documentazione “Luigi Einaudi”, Torino, Italy
1 As is well known, Tycho tried to produce an experimentum crucis between the two greatest
systems of the world via the comparison between the diurnal parallaxes of Mars and of the 
Sun. It is worth while to remark that this test was possible only by assuming (for the 
Ptolemaic system) the Aristotelian principle that the distances of the planets (including the 
Sun) are proportional to their period (Aristotle, De Coelo, II, 10). On this topic see Petroni 
[1986].

A completely different problem is represented by the relations between the calculation of 
planetary distances in the heliocentric system, and Kepler’s Third law. (Both the Area law 
and the ‘First’ law are logically independent from the calculation of relative planetary 
distances.) As a matter of fact, the Third law would have been impossible if the relative 
planetary distances were very different from those calculated via the Copernican (rather, 
‘Keplerian’) system. Of course, the Third law cannot be considered as an independent 
measurement of the planetary distances. However, it seems intuitively true that the Third law 
has to be considered as giving some confirmation to the heliocentric distances. This law 
establishes a relation between the ‘Copernican’ distances and some independent and quite 
well-determined data—i.e., the planetary periods. Were these distances completely false (or 
even merely ‘theory-laden’), how could it happen that they could give origin to a law when 
connected with the planetary periods? In principle, it is obviously possible to consider the 
Third law as a case of a purely accidental relation—exactly as any law can be considered in this 
way. Under this assumption, this law would give no confirmation at all to the validity of the 
heliocentric planetary distances. But if one accepts the thesis that laws have an ontological 
basis—that is to say, in one way or another they ‘correspond to reality’—then one can 
reasonably argue that the existence of the Third law gives support to the truth of the 
heliocentric planetary distances. (Of course, we assume here that these distances have not 
been calculated ad hoc.) 


HARTRY FIELD ON MEASUREMENT AND INTRINSIC 
EXPLANATION 


In his [1980] Hartry Field, as a side issue to the defence of nominalism, lays 
great stress on the need for ‘intrinsic explanations’ in science. Intrinsic 
explanations are explanations appealing only to ‘intrinsic facts’, facts about
the world which are statable independently of the hypostasis of mathema- 
tical entities and the use of particular mathematical representations. Field 
does not give a precise definition of what he means by intrinsic explanation 
but the general idea can be gotten from his examples: 


[T]he fact that geometric laws, when formulated in terms of distance, are invariant 
under multiplication of all distances by a positive constant, but are not invariant 
under any other transformation of scale, receives a satisfying explanation: it is 
explained by the intrinsic facts about physical space, i.e. by the facts about physical 
space which are laid down without reference to numbers in Hilbert’s axioms.¹


If... we need to invoke some real numbers like 6.67 × 10⁻¹¹ (the gravitational
constant in [SI units]) in our explanation of why the moon follows the path that it 
does, it isn’t because we think that that real number plays a role as a cause of the 
moon’s moving that way. ... The role it plays is as an entity extrinsic to the process to 
be explained, an entity related to the process to be explained only by a function (a 
rather arbitrarily chosen function at that).²


[W]e have specified the continuity of temperature with respect to space-time in a 
completely intrinsic way, a way that never mentions spatio-temporal co-ordinates or 
temperature scales.³


These quotations should suffice to give some inkling of intrinsic facts and 
intrinsic explanations. 

Field sees as desirable both intrinsic explanations and the elimination of 
certain sorts of ‘arbitrariness’ or ‘conventional choice’ from the ultimate 
formulations of theories.⁴ These two desiderata are not independent:


[O]ne of the things that gives plausibility to the idea that extrinsic explanations are 
unsatisfactory if taken as ultimate explanations is that the functions invoked in many 
extrinsic explanations are so arbitrary. For example, in the case of geometry, the 
choice of one distance function over any other one which differs from it by a positive 
multiplicative constant is completely arbitrary; it reflects in effect an arbitrary choice 
of units for distance.⁵


Satisfaction of these desiderata furthers what Field sees as a plausible
methodological premise, namely, ‘the principle that underlying every good
extrinsic explanation there is an intrinsic explanation’.⁶ Here the aim is to
question the extent to which this principle is supported in Field’s discussion 
of measurement. There are two cases to be considered: (a) length measure- 
ment; (b) measurement of a scalar magnitude such as temperature. 


(a) Length measurement. In Hilbert’s axiomatisation of Euclidean 
Geometry— henceforth Hilbert Geometry—only two predicates are used: a 


1 Field [1980], p. 27. 
2 Op. cit., p. 43.
3 Op. cit., p. 63.
4 Op. cit., p. ix.
5 Op. cit., p. 45.
6 Op. cit., p. 44.
triadic relation symbol representing betweenness, and a tetradic relation
symbol representing congruence. Consequently in any model of Hilbert 
Geometry there are two relations defined over the domain: a ternary relation 
Bet_A and a quaternary relation Cong_A. The elements of the domain A are
thought of as points. Hilbert proved a representation theorem and a 
uniqueness theorem concerning models of Hilbert Geometry: 


[G]iven any model of Hilbert Geometry, there would be at least one function d 
mapping pairs of points onto the non-negative real numbers, satisfying the following 
‘homomorphism conditions’: 

(a) for any points x, y, z, and w, xy Cong_A zw if and only if d(x, y) = d(z, w);

(b) for any points x, y, and z, y Bet_A xz if and only if d(x, y) + d(y, z) = d(x, z).


[...] [I]f d_1 and d_2 are two functions mapping pairs of points into non-negative reals,
both of which satisfy [conditions (a) and (b)], then d_1 and d_2 differ only by a positive
multiplicative constant; and conversely, if d_1 and d_2 differ only by a positive
multiplicative constant, then d_1 satisfies (a) and (b) if and only if d_2 does.¹


About these theorems Field has the following to say: 


Given these results it was easy to show that the standard Euclidean theorems about 
lengths would be true if restated as theorems about any function d meeting the given 
conditions. [. . .] [T]he fact that geometric laws, when formulated in terms of 
distance, are invariant under multiplication of all distances by a positive constant, 
but are not invariant under any other transformation of scale, receives a satisfying 
explanation: it is explained by the intrinsic facts about physical space, i.e. by the facts 
about physical space which are laid down without reference to numbers in Hilbert’s 
axioms.²


As far as it goes this is perfectly correct. As the concept of distance used by 
Euclid conforms to the conditions (a) and (b) we do indeed have an intrinsic 
explanation of the invariance of the laws of Euclidean geometry under 
Euclidean transformations. But there is still an element of convention in 
this, an element which Field apparently overlooks when he says: 


[T]he choice of one distance function over any other which differs from it by a 
positive multiplicative constant is completely arbitrary; it reflects in effect an 
arbitrary choice of units for distance. [. . .] What Hilbert did . . . was to explain, in 
terms of intrinsic facts about space which are statable without such arbitrary choices, why 
the choice of functions to be invoked in the extrinsic theory will be arbitrary to precisely 
the extent that it is.3 


To illustrate this oversight consider instead of (b) the condition (b’): 
for any points x, y, and z, y Bet_A xz if and only if d(x, y)^(1/2) + d(y, z)^(1/2) 
= d(x, z)^(1/2).4 Given a function d_1 which satisfies (b) it is trivial to find one, 
d_2 say, which satisfies (b’): let d_2(x, y) = d_1(x, y)^2. Euclidean geometry can 
be rewritten in accordance with such a distance function. For example, 


1 Op. cit., pp. 26-7, cf. p. 50. 

2 Op. cit., p. 27, suppressing mention of angles. 

3 Op. cit., pp. 45-6. 

4 For the formulation at op. cit., p. 50, set dp(x,y) = Efa 1|9(x)—@,(y)| and change (a) in 
accordance with (b’) above. 




Pythagoras’ Theorem becomes: the length of the hypotenuse of a right- 
angled triangle is equal to the sum of the lengths of the other two sides. 
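
The surrogate construction can be checked numerically. What follows is only a minimal sketch, assuming points in the plane given by Cartesian coordinates; the names `inch` and `dinch` are labels introduced here for illustration (with `dinch` taken to be the square of the ordinary distance) and are not notation from Field or Ellis.

```python
import math

def inch(p, q):
    """Ordinary Euclidean distance between two points of the plane."""
    return math.dist(p, q)

def dinch(p, q):
    """Surrogate distance: the square of the ordinary distance."""
    return inch(p, q) ** 2

def satisfies_b(d, x, y, z):
    """Condition (b): d(x, y) + d(y, z) = d(x, z)."""
    return math.isclose(d(x, y) + d(y, z), d(x, z))

def satisfies_b_prime(d, x, y, z):
    """Condition (b'): the square roots of the distances add."""
    return math.isclose(math.sqrt(d(x, y)) + math.sqrt(d(y, z)),
                        math.sqrt(d(x, z)))

# y lies between x and z on a straight line.
x, y, z = (0.0, 0.0), (1.0, 1.0), (3.0, 3.0)
print(satisfies_b(inch, x, y, z))         # ordinary distance against (b)
print(satisfies_b_prime(dinch, x, y, z))  # surrogate distance against (b')
print(satisfies_b(dinch, x, y, z))        # surrogate distance against (b)

# Right-angled triangle with the right angle at a.
a, b, c = (0.0, 0.0), (3.0, 0.0), (0.0, 4.0)
print(dinch(b, c), dinch(a, b) + dinch(a, c))  # hypotenuse vs. sum of the sides
```

On this sketch the surrogate function passes (b’) exactly where the ordinary one passes (b), and the rewritten Pythagoras’ Theorem is a plain sum.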

As a mathematical exercise it is easy to invent such surrogate distance 
functions. This would be of no importance if (b) represented an intrinsic 
constraint on distance functions, i.e. an intrinsic feature of length measurement. 
The truth of the matter is that it does not. Brian Ellis’s procedure for 
measuring using ‘diagonal inches’ satisfies (b’) and also satisfies ‘all the 
axioms for measurement that have been canvassed in the literature’. Scales 
differing from the dinches—diagonal inches—scale by only a multiplicative 
constant serve equally well and again we have an intrinsic explanation, this 
time of the invariance of the rewritten laws of Euclidean geometry under 
Euclidean transformations. The dinches scale is doubtless odd and un- 
natural but there is no condition statable only in terms of betweenness and 
congruence which prohibits it. [As orthogonality is definable solely in terms 
of congruence,2 its use in the procedure for dinch measurement is 
acceptable.] 

In effect the condition (b) places a constraint on the eight-place relation of 
differential congruence: the difference between the distance from x to y and 
the distance from x’ to y’ is congruent to the difference between the distance 
from z to w and the distance from z’ to w’. More colloquially, the difference 
between lengths l_1 and l_2 is the same as the difference between the lengths l_3 
and l_4. If one obtains in inches the difference in length between, say, two 
rigid rods by measuring their lengths then subtracting the smaller from the 
greater and finds it congruent to the difference between another pair, usually 
the result disagrees with that obtained when the lengths are measured in 
dinches. (They are the same only if the pairs of rods are matched, i.e. two of 
one length, two of another, one of each in each pair.) Differential congruence 
is not independent of the measuring procedure by which it is ascertained. 
Condition (b) corresponds to the convention that differences in length are 
treated as distances themselves and their congruence ascertained directly, 
e.g., one can place rods side by side and measure the overlap of the longer 
over the shorter. Such a procedure fails for dinches: the directly measured 
difference in length between two rods (in dinches) is not equal to the 
absolute difference in length of the two rods (again measured in dinches). 
On the other hand, to someone brought up to use dinch measurements the 
following would appear a natural definition: (i) if l_1 and l_2 are congruent then 
their d-difference is zero; if l_1 is longer than l_2 then their d-difference is the 
length in dinches of the third side of the right-angled triangle having one 
side of length l_2 and the hypotenuse of length l_1. It then turns out that the 
difference in length (measured in dinches) of two rods, say, is (the length of) 
their d-difference. The proper dinch analogue of differential congruence 


1 Ellis [1966], pp. 80ff. 
2 Field, op. cit., p. 76. 
would be d-differential congruence but this does not affect the indicated 
dependence on the measuring procedure. 
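
A small numerical illustration may make the dependence concrete. It is a sketch under the assumption that a dinch reading is simply the square of the corresponding inch reading; the function names and the rod lengths are hypothetical and chosen only for the example.

```python
import math

def to_dinches(inches):
    """Assumed conversion: a dinch reading is the square of the inch reading."""
    return inches ** 2

def d_difference(l1, l2):
    """d-difference in dinches: the third side of the right-angled triangle
    whose hypotenuse is the longer rod and one of whose sides is the shorter."""
    longer, shorter = max(l1, l2), min(l1, l2)
    third_side_inches = math.sqrt(longer ** 2 - shorter ** 2)
    return to_dinches(third_side_inches)

# Two pairs of rods, lengths in inches (illustrative values).
pairs = [(1.0, 2.0), (3.0, 4.0)]

for shorter, longer in pairs:
    inch_diff = longer - shorter                           # subtract the inch readings
    dinch_diff = to_dinches(longer) - to_dinches(shorter)  # subtract the dinch readings
    overlap = to_dinches(longer - shorter)                 # overhang measured directly, in dinches
    print(inch_diff, dinch_diff, overlap, d_difference(shorter, longer))
```

The two pairs have congruent differences when the readings are taken in inches but not when they are taken in dinches; the directly measured overhang disagrees with the subtracted dinch readings, while the d-difference agrees with them up to rounding.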

The conclusion may be stated thus: given a relation of differential 
congruence a scale of measurement is determined up to a multiplicative 
constant (i.e. a choice of unit) purely on intrinsic grounds. Condition (b) 
corresponds to a very natural differential congruence relation—the one used 
by Euclid—which is so much a part of our thought that it is tacitly 
presupposed in the usual mathematical description of Euclidean space, 
Euclid’s Laws. What is of the utmost importance here is that differential 
congruence is not fixed by the intrinsic facts, the facts about betweenness 
and congruence. Hence although Field has given us an explanation of why 
we measure the way we do and use the scales we do, he has not given us an 
intrinsic explanation. 


(b) Measurement of scalar magnitudes. Field proves representation and 
uniqueness theorems for scalar magnitudes such as temperature and 
gravitational potential somewhat in analogy to Hilbert’s theorems for 
Hilbert Geometry. Again we start with a model whose domain contains 
space (or space-time) points, and, apart from the spatial relations discussed 
above, if φ is the scalar magnitude the model contains the quaternary 
relation of φ-congruence and the ternary relation of φ-betweenness. 
Intuitively these represent, respectively, the difference in φ between points 
x and y is the same as the difference in φ between points z and w; and, the φ at 
y lies between the φ at x and the φ at z. The axioms for φ are supplied by the 
theory of measurement.1 If A is a domain, φ-Bet_A and φ-Cong_A three- and 
four-place relations respectively, then the representation and uniqueness 
theorems state: (A, φ-Bet_A, φ-Cong_A) is a model of the φ axioms if and only 
if there is a function ψ from A onto an interval of the real numbers, such that 


(a) for any points x, y, and z, y φ-Bet_A xz if and only if 
ψ(x) ≤ ψ(y) ≤ ψ(z) or ψ(z) ≤ ψ(y) ≤ ψ(x); 

(b) for any points x, y, z, and w, xy φ-Cong_A zw if and only if 
|ψ(x) − ψ(y)| = |ψ(z) − ψ(w)|; 


if ψ and ψ′ have A as domain and ψ meets the conditions (a) and (b), so does 
ψ′ only if ψ′ = aψ + b where a and b are real numbers. 

The theorem can be extended to the case where φ-inferiority replaces φ-
betweenness as primitive (the φ at x is less than the φ at y). In this case the 
constant a must be a positive real number.2 

It is abundantly clear from Field’s discussion that he regards φ-
betweenness and φ-congruence as relations whose obtaining is an intrinsic 
fact. About φ-betweenness there is no ground for complaint. φ-congruence 
is quite another matter. A somewhat indirect approach is taken in the 
attempt to show that φ-congruence is not intrinsic. 


1 Op. cit., pp. 57-8 
2 Op. cit., pp. 56-7 
Let φ be a scalar magnitude which can be measured perceptually, at least 
approximately. Luminosity is an example. For a variety of physical 
magnitudes Fechner’s Law (sometimes the Weber-Fechner Law) is known 
to apply for a large range of values. Fechner’s Law is usually stated for a 
magnitude measurable on a ratio scale. Generalising to a magnitude φ 
measured on an interval scale the law can be stated: 


Δ(φ_1) − Δ(φ_2) ∝ (φ_1 − φ_2) 


where Δ(φ_1) represents the least perceptible increase in φ at φ_1. As φ 
increases so does Δ(φ). A perceptual scale can be constructed using these 
least perceptible increases as building blocks, other points being inter-
polated. It is immediately evident that perceptual φ-congruence, that is, 
congruence of φ-difference on the perceptual scale, does not fit in with φ-
congruence. (For any φ_1 and φ_2 in a region of the scale where Fechner’s Law 
obtains, Δ(φ_1) and Δ(φ_2) are perceptually congruent.) Substituting per-
ceptual φ-congruence for φ-congruence does not affect the statement of the 
theorem. The ψ one obtains for perceptual φ-congruence is more or less 
logarithmically related to the ψ for φ-congruence. Alternative perceptual 
scales are linearly related. 
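
The logarithmic relation can be made explicit in a short sketch. It assumes, for illustration only, that the least perceptible increase is an affine function of the magnitude, which is one way of meeting the generalised law; the constants `c` and `e`, the starting value and the function names are hypothetical.

```python
import math

def least_perceptible_increase(phi, c=0.1, e=0.5):
    """Assumed affine form, so that differences of increases are
    proportional to differences of the magnitude."""
    return c * phi + e

def perceptual_scale(phi_start, steps, c=0.1, e=0.5):
    """Values of the magnitude that lie one least perceptible increase apart.
    The position n in the list is the reading on the perceptual scale."""
    values = [phi_start]
    for _ in range(steps):
        values.append(values[-1] + least_perceptible_increase(values[-1], c, e))
    return values

def perceptual_reading(phi, phi_start, c=0.1, e=0.5):
    """Closed form interpolating the step count: logarithmic in the magnitude."""
    shift = e / c
    return math.log((phi + shift) / (phi_start + shift)) / math.log(1 + c)

values = perceptual_scale(phi_start=1.0, steps=20)
for n, phi in enumerate(values):
    print(n, round(phi, 4), round(perceptual_reading(phi, 1.0), 4))
```

Successive entries are perceptually congruent by construction, yet the differences between them grow with the magnitude, which is the mismatch with ordinary congruence of differences described above.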

Perceptual scales have their limitations—in particular a total failure to 
discriminate among extreme values. The appeal to Fechner’s Law was made 
in order to show the real possibility, even in a restricted range, of an 
alternative congruence relation. (Real possibility as opposed to mathema-
tical juggling.) The moral is that φ-congruence is not independent of the 
manner in which it is ascertained. And so, it seems, it is not the case that φ-
congruence is intrinsic. 

There is some evidence in favour of these conclusions in that ‘odd’ scales 
have been used in practice. Dalton produced a temperature scale logarithmi- 
cally related to the absolute scale. In statistical mechanics it is often 
convenient to use a quantity inversely proportional to the absolute 
temperature. Similarly, to characterise wave phenomena one may use the 
wave number rather than the wave-length. Although the latter two cases 
may be regarded as uses of mathematical shorthand rather than genuine 
commitment to non-standard scales, a more telling case is familiar from the 
history of probability theory. Specific gravity and specific density are 
inverses of each other. Here especially it makes sense to say that intrinsically 
they do not constitute a pair of scalar magnitudes but rather that there is a 
single physical magnitude which can be measured in each of two ways. On 
the other hand, it is impossible to maintain this if, say, specific density 
congruence is intrinsic because, by the uniqueness theorem (given a 
particular spatial differential congruence relation), the map x → x^(-1) lies 
outside the class of permissible transformations of scale. 
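
The point about permissible transformations can be checked on a handful of values. This is a sketch only: the scale readings and the particular affine map are made up for the purpose, and `congruent` is a hypothetical helper encoding the congruence condition of the representation theorem.

```python
def congruent(psi, x, y, z, w, tol=1e-9):
    """Congruence of differences under the scale psi:
    |psi(x) - psi(y)| = |psi(z) - psi(w)|."""
    return abs(abs(psi(x) - psi(y)) - abs(psi(z) - psi(w))) < tol

def original(v):
    return v

def affine(v):
    # A permissible change of scale (arbitrary non-zero slope and offset).
    return 2.5 * v + 7.0

def reciprocal(v):
    # The map taking a magnitude to its inverse.
    return 1.0 / v

# Four readings whose differences are congruent on the original scale.
x, y, z, w = 1.0, 2.0, 3.0, 4.0

print(congruent(original, x, y, z, w))
print(congruent(affine, x, y, z, w))
print(congruent(reciprocal, x, y, z, w))
```

The affine rescaling preserves the congruence and the reciprocal map destroys it, so a magnitude and its inverse cannot both have their congruence relation fixed intrinsically.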

These examples help show that there is rather more convention in science 
than Field allows and, correspondingly, intrinsic explanations of why we 
measure the way we do are rather harder to come by. Indeed if there are 
intrinsic facts which determine scales of measurement up to the extent 
required for the representation theorems to serve as the bases of such 
explanations there must be intrinsic facts which the use of ‘odd’ scales would 
lead us to get wrong. These facts must go beyond spatial betweenness and 
congruence, and betweenness for scalar magnitudes. It is my belief that 
there are no such facts, a belief based simply on the inability to see anything 
wrong with dinches, with Dalton’s temperature scale, with specific density 
and specific gravity viewed as essentially—intrinsically—the same thing. 
Certainly there is no obvious principled way of eliminating the putative 
elements of convention disclosed above. 


PETER MILNE 
London School of Economics 
WAS DESCARTES’S COGITO A DIAGONAL DEDUCTION? 


In ‘Descartes’s Diagonal Deduction’ [1983] Peter Slezak presented a novel 
interpretation of the cogito. A similar interpretation was later independently 
advanced by William Boos in his [1983]. Professor Slezak’s interpretation 
was foreshadowed in his [1982] criticism of Lucas’ attempt to refute 
mechanism by an appeal to Gödel’s theorem (see Lucas [1961]). Slezak 
believes that his account of the cogito may help to turn the tables on Lucas by 
showing how Gödel’s theorem can be construed as supporting rather than 
refuting mechanism. Although I am sympathetic to Slezak’s overall project 
of reconciling the ‘deep intuitive implausibility’ of materialism with its 
philosophical merits, I have doubts about whether his interpretation of 
Descartes’s cogito serves this purpose. The basic difficulty is that Slezak’s 
interpretation, in effect, assigns ‘Buridan sentences’ a legitimate role in 
Descartes’s philosophy. The paradoxical nature of these sentences would 
have the peculiar result of undermining Descartes’s cogito while enabling him 
to disprove God’s existence. Although I believe that essentially the same 
objection faces Boos, I shall restrict my discussion to Slezak’s article. 
Slezak provides two reconstructions of Descartes’s cogito. One portrays 
thoughts as mental representations or internal models. Here the cogito is cast 
as an attempt to delete these inner representations. The insight of the cogito 
is that this deletion process cannot be complete for the deletion process 
cannot be applied to itself. To make the logical form of the argument more 
explicit, Slezak elaborates with the help of a second reconstruction. It 
proceeds through the familiar device of representing a person’s corpus of 
beliefs as a list of propositions to which he would assent. Descartes’s beliefs 
might thus run: 


(1) Grass is green 
(2) Roses are red 
(3) Snow is white 
etc. 


Descartes’s radical doubt can then be represented as denial of each member 
on the list. These denials yield another list of propositions of the form: 


(x) I doubt (n) 


where n is the number of propositions in the initial list. Universal doubt 
would require that (x) itself be doubted. But when n = x, the following 
proposition results: 


(x*) I doubt (x*). 


Given that we follow Slezak in reading ‘doubt’ as ‘believe to be false’, (x*) is 
self-defeating. For if I doubt (x*), it is thereby true. In this sense, (x*) is 
indubitable. 

Slezak goes on to point out the parallel between (x*) and the liar paradox 
and other diagonal arguments such as Russell’s paradox and Gödel’s 
theorem. The liar purports to deny its own truth: 


(n) (n) is not true. 


In effect, Gödel’s sentence denies its own provability. Some commentators 
on the Knower’s paradox, 


(K) No one knows this sentence, 


have maintained that (K) is an ordinary language version of Gödel’s theorem 
(see Tymoczko [1984]). In turn, (K) has been thought by some to be 
crucially involved in the Hangman paradox (see Kaplan and Montague 
[1960] as well as Nerlich [1961]). One of the reasons why Slezak’s 
interpretation of the cogito is so interesting is that it places the cogito in a 
family of puzzles and theorems that are of great current interest to 
philosophers. I think Slezak is correct in emphasizing the parallel between ‘I 
doubt this sentence’ and the liar. However, this parallel is too close for 
comfort. 

Familiarity with the resemblance between sentences like (x*) and the liar 
dates back to at least the fourteenth century as is evident from Jean 
Buridan’s Sophismata. Recently, Tyler Burge in [1978] and [1984] has 
dusted off these Buridan sentences in order to advance the study of the liar. 
Consider 


(a) It is not the case that I believe (a). 
Sentence (a) is paradoxical insofar as it appears to force me into endless 
vacillation. If I believe (a), then I should not believe (a) because my belief in 
(a) ensures its falsity. On the other hand, if I do not believe (a), I should 
believe it because my lack of belief ensures its truth. (a) is anti-incorrigible; I 
must be mistaken about it (whether commissively or omissively). 

Of course, my vacillation will not actually proceed endlessly. For I will 
eventually tire of thinking about (a). At that point, an outsider can know 
whether (a) is true by checking whether (a) is amongst the set of things I 
believe. If (a) is a member of the set, (a) is true. Otherwise, (a) is false. 
Although I cannot know whether (a) is true, people other than me can know. 
For them, it is a decidable empirical question. 

The anti-incorrigibility of (a) is surprising. But surprises in themselves 
are not philosophically interesting. What is philosophically interesting is the 
exposure of inconsistent beliefs which are responsible for the surprise. 
Given that (a) is semantically satisfactory, suspicion falls on some of our 
epistemic principles. There are three candidates. First, there is a principle of 
self-awareness; one is aware of one’s doxastic states. In other words, if one 
believes p, then one believes that one believes that p, and if one neither 
believes nor disbelieves p, then one believes that one neither believes nor 
disbelieves p. Second, there is the principle of deductive closure; if one 
believes that p, then one believes all of the consequences of p. Third, there is 
the principle of direct consistency; one cannot both believe a proposition 
and believe its negation. Few philosophers believe that these principles 
apply to ordinary people. The principles are only intended to apply to ideal 
thinkers. The idealization is motivated by the fact that our normative 
standards for appraising beliefs appear to embody these principles. In any 
case, let us now turn to the question of how an ideal thinker would respond 
to a sentence such as (a). Unlike me, the ideal thinker does not tire out or 
make logical errors. His freedom from epistemic flaw generates a contradic- 
tion. Either he believes (a) or he does not. If he believes (a), then his self- 
awareness guarantees that he also believes that he believes (a). The ideal 
thinker’s deductive closure then guarantees that he infers from his belief in 
(a), the falsehood of (a). So the ideal thinker will both believe and disbelieve 
(a). But this violates the ideal thinker’s requirement of consistency. Now 
suppose that the ideal thinker does not believe (a). Self-awareness ensures 
that he will believe that he does not believe (a). Deductive closure ensures 
that he will then infer from this lack of belief in (a), the truth of (a). The ideal 
thinker will then both believe and not believe (a). So once again, a 
contradiction arises. We should therefore conclude that such an ideal 
thinker is impossible. (For a formal version of this argument, see Burge 
[1978].) 
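
The two horns of the argument can be laid out as an exhaustive check over the thinker’s possible belief states. This is a toy encoding of my own devising, not Burge’s formal version: a single Boolean stands for whether the thinker believes (a), and the function names are hypothetical.

```python
def truth_of_a(believes_a: bool) -> bool:
    """(a) says: it is not the case that I believe (a)."""
    return not believes_a

def ideal_thinker_is_coherent(believes_a: bool) -> bool:
    # Self-awareness: the thinker knows which belief state he is in.
    # Deductive closure: from that state he infers the truth value of (a).
    inferred_truth = truth_of_a(believes_a)
    believes_not_a = not inferred_truth  # he infers that (a) is false
    must_believe_a = inferred_truth      # he infers that (a) is true
    if believes_a and believes_not_a:
        return False  # believes (a) and its negation: direct consistency fails
    if not believes_a and must_believe_a:
        return False  # both believes and does not believe (a)
    return True

coherent_states = [b for b in (True, False) if ideal_thinker_is_coherent(b)]
print(coherent_states)
```

On this encoding neither belief state survives, which is the contradiction the argument draws for the ideal thinker.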

The anti-incorrigibility of (a) makes it an inadequate foundation for 
knowledge. However, (a) corresponds to an interpretation of Cartesian 
doubt in which ‘doubt’ is read as ‘either disbelieving or suspending 
judgement’. Since Descartes observes that the fundamental certainty of the 
cogito emerged ‘whilst I thus wished to think all things false’, Slezak 
maintains that the intended reading of ‘doubt’ is ‘disbelieve’. Does this 
make a difference? Consider 


(b) I believe (b) is false. 


If I believe (b), then (b) is not true, so I should not believe (b). On the other 
hand, if I believe (b) is false, then (b) is true, so I should believe (b). Once 
again, the objection to disbelieving (b) is paralleled by an equally strong 
objection to believing it. So this reading also leaves the cogito unserviceable. 

In addition to rendering the cogito unserviceable, Slezak’s interpretation 
would undermine Descartes’s theism. For if Buridan sentences are semanti- 
cally satisfactory for use in the cogito, they can also be used to establish the 
logical impossibility of God’s existence. 

In ‘Some Neglected Problems of Omniscience’ [1983], Patrick Grim has 
us consider different versions of the ‘Divine Liar’: 


(D1) God believes that (D1) is false. 
(D2) God believes that (D2) is not true. 
(D3) Sentence (D3) is not on the list of things God believes. 


Grim defines ‘omniscience’ as a matter of knowing and believing all and only 
truths. So Grim is able to derive a contradiction from the supposition that 
(D1) is true. For (D1) can only be true if God is mistaken about (D1). Grim is 
also able to derive a contradiction from the supposition that (D1) is false. For 
if (D1) is false, God makes an error of omission by failing to believe that (D1) 
is false. So if (D1) is either true or false, God is not omniscient. Given that 
omniscience is an essential feature of God, a priori atheism results. If we 
insist that (D1) is neither true nor false, Grim points out that similar 
scepticism about omniscience can be achieved by means of (D2) or (D3). 
Although Grim regards these sentences as variations of the liar, we can 
more conservatively view them as Buridan sentences. The only noteworthy 
difference between these sentences and (x*), (a) and (b) is the substitution of 
‘God’ for ‘I’. This substitution does not alter the semantic adequacy of the 
sentence. One advantage of recasting Grim’s argument as an appeal to the 
Buridanean nature of (D1)–(D3) is that our suspicions about the liar make 
Grim’s original argument look more sophistical than it need appear. The 
apparent empirical nature of Buridan sentences makes it more difficult to 
discount them as semantically deficient. Indeed, the peculiarities of Buridan 
sentences may be assimilated to those associated with the sentence which 
puzzled Moore: ‘It is raining but I do not believe it’. For example, if we 
grant that Buridan sentences express propositions, as their empirical nature 
seems to require, they qualify as Moorean sentences under the definition 
suggested in Sorensen [1985]. A second advantage of recasting Grim’s 
argument in Buridanean terms is that Grim’s insight can then be easily 
embedded in the thoughts of a Slezakian Descartes. We can picture 
Descartes wondering how God would respond to ‘I do not believe this 
sentence’. Since ‘I’ now refers to God, the proposition expressed will be 
equivalent to one of Grim’s ‘Divine Liars’. 

Given that Descartes viewed the cogito in the way Slezak describes, it is 
fairly clear as to how he should have resolved his wonderment. If God 
exists, then he is an ideal thinker. Since knowledge implies belief, God’s 
omniscience guarantees self-awareness and deductive closure. If we define 
‘omniscience’ as merely knowledge of all truths, we have not ruled out the 
possibility of an inconsistent omniscient being. We could rule out this 
possibility with another definition, as Grim does. However, there is no need 
to do so since God’s consistency is conceded by theists. Theists attribute 
omniscience to God because they view God as the best conceivable being. If 
God were not omniscient, we would be able to conceive of a better being 
than God; one who had more knowledge. Likewise, it can be argued that 
God is perfectly consistent because otherwise we would be able to conceive 
of a better being than God; one free of inconsistencies and the inevitable 
errors inconsistency implies. So God is an ideal thinker. But it has already 
been shown that ideal thinkers are impossible. Therefore, God’s existence is 
impossible. 

Of course, most contemporary philosophers would have reservations 
about this argument for a priori atheism. For most would extend their 
doubts about the semantic status of the liar sentence to Buridan sentences. 
However, a Slezakian Descartes would not be in a position to shelve the 
atheistic argument on semantic grounds. If Buridan sentences are suf- 
ficiently well understood to play a crucial role in the construction of a certain 
foundation for all of Descartes’s knowledge, they cannot embody semantic 
mysteries which disqualify the above argument for the logical impossibility 
of God’s existence. What’s fair game for the cogito is fair game for the a priori 
atheist. 

But we should recall that the cogito itself was not without a hitch for 
Slezak’s Descartes. Although it is true that I cannot coherently doubt ‘I 
doubt this sentence’, it is also true that I cannot coherently believe it, or even 
suspend judgement about it. If I cannot believe the sentence, then I cannot 
know it. Therefore, it cannot be my epistemological Archimedean point. So 
if forced to choose between this version of the cogito and theism, Descartes 
would have lost little by choosing theism. 

Overall, I think Slezak’s interpretation of Descartes’s cogito leaves 
Descartes vulnerable to embarrassing objections. Whether or not the 
objections succeed, they do at least raise questions. Yet Descartes displays 
little interest in answering these questions. Rather than concluding that 
Descartes overlooked these points, it is more likely that the objections are 
irrelevant to the cogito that Descartes had in mind. Since the objections are 
relevant to the cogito Slezak had in mind, I think we should conclude that 
Slezak’s cogito is not Descartes’s cogito. 

ROY A. SORENSEN 
University of Delaware 
BEN-ZEEV ON THE NON-EPISTEMIC 


Aaron Ben-Zeev, in a recent article in The British Journal for the Philosophy of 
Science, explored the philosophical origins of the notion of a passive level of sensory 
registration distinct from an active perception level. In his closing section he 
endeavoured to show the inapplicability of such a notion. In this discussion paper it 
is argued that in his claims he has ignored the latest developments in the theory of the 
non-epistemic field, with respect to both logical and empirical considerations. In 
particular, he has failed to recognize the evolutionary significance of a non-epistemic 
level of sensing. 


F. H. Bradley once characterized the progress of philosophy as a ‘continual 
attempt to escape from the fallacy of the false alternative’ (Bradley [1914], 
p. 238). Aaron Ben-Zeev has presented his readers with a false alternative in 
his recent article in these pages on the Passivity Assumption of the Sensa- 
tion-Perception Distinction (Ben-Zeev [1984b], 327-34). One must 
either declare oneself an old Sense-Datum theorist, trapped in an inex- 
plicable dualism of object and appearance, or a sane objectivist, who can see 
through all the tortuous ambiguities of the Representationalist’s case to 
the real thing. 

It appears that, because the neonate can show discriminations concerning 
distance, form, size constancy, and so on, it is empirically dubious to 
postulate a passive level of sensing for the neonate, and hence for everyone. 
Another empirical proof he adduces is that of the Constancy Phenomenon: 
the objective colour of an object is perceived under a considerable variety of 
lighting conditions, and, hence, there can be no private subjective element, 
in particular, in the form of some cortical display in the brain. Like the 
ecological psychologists, with whom Ben-Zeev has some sympathy (Ben-
Zeev [1984a]), he finds the notion of inner displays counter-intuitive as well 
as illogical, and holds to the commonsense view that perceptual objects are 
in the environment. He follows the early Collingwood in declaring that 
without some notion of cognitive discrimination it is self-contradictory to 
postulate a sensed field of distributions. These ‘mental entities’ —his mode 
of classifying representations—cannot be placed on any parameter which 
connects with quantitative measurement. Nonsensical, too, to claim that a 
meaningful percept could be constructed out of meaningless material: the 
raw material must be relevant. There cannot be two ‘stuffs’, one perceptual 
and the other physical. He follows Toulmin [1981] in asserting that the 
Greeks were more modern than the empiricists of the seventeenth and 
eighteenth centuries, for they, like Gibson, argued for a close link between 
observer and observed, instead of falling into the trap of assuming a 
metaphysical gap between perceptual and physical existents. The empiricist 
motivation in his view was connected with the attempt to establish a causal 
chain between external object and internal percept, which, for the tough- 
minded amongst them, was part of the general aim of arriving at a materialist 
account of mind. 

Given Ben-Zeev’s own presuppositions and his conception of the target 
he is attacking, most of the arguments seem to hang together and appear to 
force one into the choice he proposes. Nevertheless, the alternatives are not 
now confined to two. New Representationalism does not make the same 
claims as Sense-Datum Theory and the objections that hold against that 
theory have no purchase on what has replaced it. Yet it still has at its core the 
hypothesis that Ben-Zeev is attacking, namely, that there exists a level of 
non-epistemic sensing for all sensory modes which is distinct from the 
epistemic sorting that is performed upon it, but its version of that hypothesis 
has a character invulnerable to the charges made, and one that has 
disturbing consequences for Ben-Zeev’s own position. 

To begin with a distinction which Ben-Zeev and others (ecological 
psychologists, ‘homuncular functionalists’, artificial-intelligence research- 
ers) have failed to make. In one of his recent articles, Ben-Zeev ([1983], 
63) mentioned the Common Tick as having a ‘cognitive system’ because it 
can make appropriate responses in the service of its life- and species- 
maintenance. The Common Tick (Ixodes ricinus) has a remarkable dual 
ability to detect the presence of an animal such as an ox under the branch on 
which it hangs, to fall upon the animal, and, once attached to the hide, to bite 
it. The discovery has been made that the abilities are due to two innate 
releasing-mechanisms: the dropping from the branch is triggered by the 
presence of butyric acid in the air (present in the perspiration of the animal) 
and the biting by a temperature of 37°. These automatisms, however, cannot 
be credited with the term ‘knowledge’; even to use the term ‘information’ for 
aspects of the operation of such a system is likely in the philosophical context 
to lead a harmless metaphor into trapping the unwary into false conclusions. 
Should it happen that butyric acid was exuded by a predator of the tick, the 
behaviour would be maladaptive and beyond correction. Konrad Lorenz 
([1977], p. 55) points out that the amazement one feels at the sight of a 
recently born turkey chick cowering at the presence of a bird of prey in flight 
above it in the sky is much reduced when the same cowering behaviour is 
elicited by the sight of a large fly crawling across the ceiling. Human beings 
also have such automatic responses outside voluntary control. Take, for 
example, the eyes’ reflex adjustment to dark conditions, the so-called ‘night- 
vision’: no one is going to credit the eyes with a cognitive system. Where a 
stimulus produces a reflex motor response there is no necessity for there 
being any inspectable sense-field in the operation of that reflex; soon first- 
level robots will be able to perform complex automatic behaviours in 
response to external stimuli, but there will be no need to credit them with 
sense-fields. However, like the Tick, they will not be able to correct 
responses that are maladaptive should the stimulus prove to be ambiguous. I 
have dubbed this inadequacy in such first-level systems the ‘Sorcerer’s- 
Apprentice’ failure (Wright [1985b]). The Broom that the Sorcerer’s 
Apprentice ordered to fetch water went on fetching the water when the 
cistern was full. However sophisticated such reflex responses in robot or 
primitive organism, no one is going to honour them with the term 
‘epistemic’ if in entirely new untoward circumstances they lack the ability 
even to stop the maladaptive behaviour. 

In Ben-Zeev’s formulation there is a linear increase from the ‘cognitive’ 
system of the Tick to that of man. By his analysis ‘information’ means the 
same all the way from the reflex act to the epistemic choice. However, the 
appearance of sense-fields in the course of evolution and the pain/pleasure 
systems which accompanied them could be looked upon under another 
hypothesis (see further Wright [1985b]) as permitting a significant change to 
be brought about in motor reactions to effects produced by external 
influences (the first evidence of nociceptors appears in fossil remains of 150 
million years ago). The Tick example makes it clear that what its primitive 
system responded to was not the object the ox (though that is what human 
beings would identify as the source of what is life-maintaining for the tick) 
but to a characteristic molecular input, in which no epistemic recognition is 
involved—no tick has the concept Butyric Acid. In a similar way the sense- 
organs of an advanced animal respond, not to objects, but to energy- 
distributions or molecular states (light-wave distributions into the eyes, 
sound-wave intensities at the ears, ionization-states for taste-buds, tension- 
states in bodily tissue, etc.). Neural impulses from these are led, under the 
hypothesis we are considering, not to direct motor-control, but to 
presentation-fields. At the instigation of pain (characteristically caused by 
damage to the animal) or of pleasure (caused by life- or species-maintaining 
behaviour) molar sortings are made from those presentation-fields. Many 
animals have instinctive molar sortings already provided, but the key 
advantage is that the learning of new percepts not before operative in the 
system is now possible. What would be unalterable and maladaptive in a
system that had reflex connections between external input and motor
response, is now replaced by a flexible process. It is important to note that 
the new percepts established by a pain/pleasure regime do not necessarily 
take on the reflex character: in the higher animals adjustment of those 
percepts, however well ingrained, remains a possibility. In some lower 
animals, for example, it is only during certain maturational periods that 
learning can take place, after which the percepts do take on an unalterable 
character. Man’s advantage over the learning animal is that he can help to 
alter another’s percepts by means of language. Some animals can pass
learned behaviour on to their offspring, but only by a performance of that
behaviour in the presence of the distribution that caused the valuable 
percept (as of a chimpanzee passing on the straw-method of extracting 
termites from a nest by actual performance at such a nest). Man can adjust 
another’s percepts in the absence of that which causes the input. 

What is vital in the sensory system is that the sense-field, apart from a few 
set responses (as of the child to the breast) and from any pre-phenomenal 
processing (such as—for the visual field—enhancement of edges, sharpen- 
ing of contrasts, steadying of backgrounds during head or eye movement), 
must contain no given molar sorting that is linked to the pain/pleasure 
system. Though an instinctive pre-phenomenal process may enhance the
contrast of an edge, there is no guarantee that that edge will be recognized as 
one of a particular object. 

Here is reached the key conflation made by Ben-Zeev. He equates sensory 
discrimination with epistemic identification. It is certainly true that any part 
of the field that is discriminated in the presentation could, if need be, serve in 
an identification, but not all discriminated regions and variations do so 
serve. A may have better sight than B when they both look at the tree trunk, 
but B immediately identifies the camouflaged moth that is not discerned by
A. David Kelley ([1980], 403-5) has pointed out that there are at least 7.5 
million possible discriminata in the colour solid. One can agree with Joseph 
Runzo ([1982], 207) that any of these could in principle be appropriated for 
epistemic recognition purposes, but it is patent that, although they are sensed 
and perhaps even digitalized at the neural level, they are not all pro- 
positionalized at the epistemic level. One can add a further empirical fact: 
observers differ in their JNDs (‘Just-Noticeable-Differences’), so that one
agent can distinguish colours, changes of pitch or of temperature, etc. that 
another cannot. In a dire situation where the help of a person with a greater 
refinement of JND would get one out of danger, would one try to maintain 
that there was no non-epistemic difference between one’s own red and his? One
could establish that a neonate had such discriminata at the level of sensing, 
but it would not imply that each discrimination was attended with a percept 
at the level of epistemic selection. 

There is an important logical distinction to be made here, already pointed 
out by J. B. Maund ([1975], 48-9) and Virgil C. Aldrich ([1980], 55-6),
between ‘field-determinateness’ and ‘object-determinateness’. Take the
example of a TV screen. The picture is produced by matrices of phosphor
cells on the back of the CRT screen, three intersecting matrices for the three 
colours. For any state of the screen (‘live’ broadcast, video-tape, cartoon 
film, computer-excited distribution, interference, or combinations of 
these), a digitalized statement, that is, in the form of quantifications about 
the point-states of the matrices can be empirically given without any 
reference to what the screen’s distribution might be described as showing 
(say, some ‘snow’-interference, or a picture of the Louvre, or a ‘Wordstar’ 
menu). Let a randomized computer pattern be on the screen. To one person, 
who happens to be playing the equivalent of a ‘faces-in-the-fire’ game in the 
manner of a Rorschach test it looks like an eagle; to a literate Chinese it 
happens to resemble the ideogram for ‘water’; to a Nigerian it looks 
remarkably like an initiation-mask of the Yoruba tribe; etc. What each takes 
it to be a picture of is irrelevant to the quantifications of the point-state 
distribution. The propositionalizings on this field-level are obviously 
unconnected with the propositionalizings at the object-level. The former is 
its ‘field-determinateness’, which can be specified without reference to what 
is being objectified from it. As Aldrich puts it, it is vital to distinguish 
the description of the state of the representing field from what is
represented in it. As to what is represented upon a TV screen, there can be 
discussion with appeal to relevant circumstances: this is ‘object- 
determinateness’, which is clearly on another logical plane from that of the 
point-states of the field. 
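
The distinction can be made concrete with a small sketch in Python. It is offered only as an illustrative toy, not as anything found in Maund or Aldrich: the function names `make_field` and `field_description`, the choice of mean channel values as the 'quantifications', and the viewer labels are all assumptions introduced here. The point it is meant to show is simply that the field-level description is computed from point-states alone, and is unaffected by whatever any viewer takes the pattern to depict.

```python
import random


def make_field(rows, cols, seed):
    """Return a randomized screen: a grid of (r, g, b) point-states, each 0-255."""
    rng = random.Random(seed)
    return [
        [(rng.randrange(256), rng.randrange(256), rng.randrange(256))
         for _ in range(cols)]
        for _ in range(rows)
    ]


def field_description(field):
    """Field-determinate description: quantities about point-states only.

    Nothing in the result names an object; it only counts points and
    averages the three colour channels (an assumed, deliberately crude summary).
    """
    points = [point for row in field for point in row]
    n = len(points)
    return {
        "points": n,
        "mean_red": sum(p[0] for p in points) / n,
        "mean_green": sum(p[1] for p in points) / n,
        "mean_blue": sum(p[2] for p in points) / n,
    }


# Object-determinate readings: what each (hypothetical) viewer takes the
# pattern to show. They are supplied by the viewers, not derived from the field.
readings = {
    "viewer_a": "an eagle",
    "viewer_b": "the ideogram for 'water'",
    "viewer_c": "an initiation mask",
}

field = make_field(4, 6, seed=1985)
before = field_description(field)

# Revising a reading changes nothing at the field level.
readings["viewer_a"] = "a face in the fire"
after = field_description(field)
print(before == after)
```

The two levels share no vocabulary: the dictionary returned by `field_description` can be compared, stored or revised without consulting `readings`, and the readings can be disputed without altering a single point-state.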

Transfer the account to the sensing and perceiving system. The point- 
state level of the sensory field could conceivably be specified by a 
neurophysiologist of the future, or at least enough of it to prove the case. The 
propositionalized description would contain no reference to any objects a 
particular brain might or might not be imputing to the distributions of the 
field. The field is non-epistemic vis-a-vis object-descriptions while it 
remains scientifically describable at the level of a field-determinate descrip- 
tion. It may be objected that no neurophysiologist as yet is able to make such 
a description or even knows how to go about it, but the fact that point-state
descriptions are necessary without reference to objects can easily be 
demonstrated. 

Take the Double-Image situation. I hold up a ‘Tippex’ bottle before my
eyes and do a convergent squint (the direction of the two eyes converging 
more sharply than for normal superimposition of the right eye’s and the left- 
eye’s fields). It becomes obvious that my right eye can see a red patch on the 
label while my left eye cannot, since it is looking from further round. 
According to an old-fashioned Objectivist like Gilbert Ryle or even a 
contemporary one like George Pitcher ([1971], p. 41), I am merely seeing
‘one thing twice’. Now according to an ‘object-determinate’ account I am, 
but what am I seeing according to a ‘field-determinate’ one? The fact that 
my right eye can see what my left eye cannot (and vice versa) brings out the 
empirical fact that the two fields are not the same. For a scientist it becomes
perfectly plain that the distinction between the visual fields of the right eye
and the left eye cannot possibly be accounted for by the statement that I am
looking at the same ‘Tippex’ bottle from two different angles, for the whole
of each of the two fields is infected by a difference of distribution. Every book on 
the shelf beyond is differently viewed; indeed, the light-wave distribution 
into my right eye is markedly different from that of my left eye at every point. 
To try to collapse an adequately quantified account of that field-difference 
onto ordinary-language remarks like ‘I am merely seeing one thing twice’ 
now proves to be laughably misconceived, to be—to borrow a favoured 
antique phrase of Ryle’s—a ‘howler’. If now a point-state description at a 
field-determinate level is given of those two fields in order to account for this 
difference in an adequate propositional form, there will be no reference 
whatever to ‘“Tippex” bottle’, ‘books’, ‘shelves’ or any other object-term
that I might wish to apply to my visual field. It becomes logically clear that, 
in order to experience a non-epistemic difference that is not specified in 
object-terms, all Ben-Zeev has to do right now is to close his left eye, then 
open it and close his right eye. And furthermore, when the 3-D experience 
returns with both eyes open, how many persons are aware of the sensory 
difference? In a recent rough survey in a school of some classes of children
between the ages of 11 and 13, I discovered that fewer than 25% of the children
were consciously aware of any difference in their visual field when looking
with one eye only. The remainder named objects perfectly well, but were
completely unaware at the conscious level of the 3-D phenomenon of the 
everyday experience. Some even went on protesting that they could perceive 
no difference whatsoever, though it was highly likely that their stereoscopic 
system was in perfect order. Here an aspect of the non-epistemic was 
continuously present and yet was entirely unconceptualized. 

As regards the Constancy Phenomenon, one has only to refer Ben-Zeev 
now to Edwin Land’s investigations (Land [1977]) to cast doubt on the 
externality of the experience. By the New Representationalist’s view of the 
matter, this is a further proof of the internality of the experience since it now 
seems likely that cortical colour is produced by computation across the field 
of the relative intensities of three light-frequency bands. As regards 
Gibsonian processing, I have argued elsewhere that it is a logical error to 
believe that, because neural processing takes place upon input, the existence 
of a cortical phenomenon is ruled out (Wright [1983c]). Trying to prove the 
absence of subjectively sensed qualia through the existence of neural 
processing is logically equivalent to trying to prove the absence of the 
phosphor glowings on the back of a CRT screen on the ground that the set 
happens to contain circuits that process the input signal before it gets to the 
screen (such as the ‘Brightness’, ‘Colour’, and ‘Contrast’ controls). 

It was unwise of Ben-Zeev to quote the Collingwood of 1923 
(Collingwood [1923]) because by 1938 Collingwood, influenced no doubt by 
Russell, Moore and Broad, had completely changed his opinion. Indeed, a 
well-argued defence of the non-epistemic can be found in The Principles of
Art (Collingwood [1938]). There he even argues for the real existence of the
field when it is presenting ‘illusory sensa’. He puts the case that the field
remains real-as-a-field whatever it shows:


A real sensum means a sensum correctly interpreted; an illusory sensum, one falsely 
interpreted. And an imaginary sensum means one that has not been interpreted at all; 
either because we have tried to interpret it and failed, or because we have not tried. 
These are not three kinds of sensa, nor are they sensa corresponding to three kinds of 
sensory act. Nor are they sensa, which, on being correctly interpreted, are found to 
be related to their fellows in three distinct ways. They are sensa in respect of which 
the interpretative work of thought has been done well, or done ill, or not done at all. 
([1938], p. 194) 


Ben-Zeev has shown himself unscientific about one kind of ‘illusory sensa’, 
namely dreams (which, according to Collingwood, are as real in sensory 
terms in the visual field as any open-eye view, just as a cartoon on TV exists 
in real phosphor glowings). In an earlier article (Ben-Zeev [1983], p. 59) he 
apportions dreams to a ‘different kind of reality’. Unless he accepts 
Collingwood’s explanation, dreams will remain occult in his proposals. 
Collingwood was also aware that not all sensing is accounted for by the 
percepts we apply to it. Those who find it difficult to accept the notion of the 
non-epistemic forget that there are autistic children. As Collingwood says, 


looking is different from seeing and listening from hearing (Collingwood [1938], 
p. 203)


Compare this with what a psychologist who works with autistic children 
says about them: 


Though they can hear and see, it appears that they may not listen and look. 
(Hermelin [1976], p. 137) 


But one’s body does not have to be that of an autistic child in order to 
experience non-epistemic experience, as has already been seen in the 
‘looking with the right eye, then with the left’ point above. In absolute 
darkness whether the eyes are open or closed is of course irrelevant, but the 
question to ask is ‘What is one experiencing?’ The Objectivist who answers 
‘Nothing’ is not being scientific, for one is experiencing Black, a sensory 
state. Locke himself knew that there was a positive state of the visual field for 
this ‘privative’ condition of it (Essay, II, viii, 6), and Lord Brain recognized 
the fact (Brain [1951], pp. 10 and 15). If Ben-Zeev argues that the concept 
‘Black’ can still be applied to it, one can agree while reminding him that 
many a neonate experiences Black before it can apply any such concept. 
There are more empirical cases that can be quoted to prove the existence of
unconceptualized sensed discriminata but reference to them will here suffice 
(Wright [1983a, 1983b, 1985a]). 

A fitting conclusion is to recall the evolutionary need for a non-epistemic 
field. The presentational medium is the result of a combination of causes,
among which, in the case of open-eye seeing, are an incoming colourless
energy-distribution, neural processing (together with any interference 
caused by failures in the system) and the condition of the presentational 
medium itself. What aspects of those causes are selected for purposive 
attention is another matter. If neural processing (e.g. edge-enhancement) is 
innate and unalterable, fortunately percepts are not. This is the key 
advantage of the higher animals. Instead of the draconian evolutionary 
solution of the maladapted creature dying off and the chance mutation with 
instinctive adaptation surviving, the higher animals can change their 
percepts. Instead of bark, they now see their camouflaged prey, and there 
was not even any co-referentiality between one object and another in that 
switch of gestalt. It is vital then that the sensed fields are not tied to object- 
selection of any kind, but it is precisely this freedom to change what shall be 
selected as ‘an’ object that produces the evolutionary advantage. 

It is essential that observation begin with ‘meaningless materials’ if the 
needful readjustment of percept is to remain a possibility. It is no use Ben- 
Zeev pointing to the neural processing as supposedly tainting the non- 
epistemic field. All those adjustments are innate and unalterable, open in the 
contingency of circumstance to becoming maladaptive at the very next life- 
challenge. They are only the evolutionary residue of successful species- 
interplay with nature; they, like the field itself, are entirely non-intentional. 
All the active percept-choosing element lies in the perceiving module; the 
non-epistemic remains utterly passive from the knowledge point of view, 
however sophisticated the reflex systems involved. Ben-Zeev’s (and pre- 
sumably Toulmin’s) characterization of ancient philosophy as supportive of 
their view conveniently ignores the Pyrrhonian strand, the opponents of the 
Platonic Academy. Sextus Empiricus was correct in claiming that sense- 
perception began in an ‘involuntary affection’. 

Finally, if Ben-Zeev and other Objectivists still wince at the thought of 
colours in the brain, there is a particular re-ordering of understanding that is
required. By the New Representationalist hypothesis, colours are not 
features of the surfaces of molecules. Therefore they cannot be the features 
of the surfaces of molecules in the brain either, but they can still be neural 
experiences produced by complexes of molecules in the brain. Sensed 
colours are not looked at externally; no sensed red is part of the surface of a 
pillar-box, not even projected there in Jackson’s magical fashion (Jackson 
[1977], pp. 102-3). Similarly, no cortical colour can be looked at; by 
definition it cannot, but it can be experienced, as Stephen C. Pepper argued 
(Pepper [1961]). The Homunculus/Vicious Regress Objection to inner 
sensings falls away, for sensed colours are not looked at by the brain. The 
Solipsist Objection and Austin’s Illusion/Delusion Objection also do not 
have any hold upon this new position (Wright [1983a, 1985a]; for a general
survey, see [1984]). The time has come for the neurophysiologist to search 
for the way in which colours, sounds, smells, feels, hots, colds, pains and 
pleasures are realized in the neural circuitry. And it would be a crude
mistake if, in looking for structures in which red was being experienced, he
was on the watch for a portion of the brain that looked red to him. 


EDMOND WRIGHT 
Pembroke College, Oxford


A CHALLENGE TO THE FOLLOWERS OF LAKATOS


Imre Lakatos [1970] in his paper ‘History of Science and its Rational 
Reconstructions’ made the following points:


It will be argued that (a) philosophy of science provides normative methodologies in
terms of which the historian reconstructs ‘internal history’ and thereby provides a 
rational explanation of the growth of objective knowledge; (b) two competing 
methodologies can be evaluated with the help of (normatively interpreted) history; 
(c) any rational reconstruction of history needs to be supplemented by an empirical 
(socio-psychological) ‘external history’. 


But the methodology of research programmes draws a demarcation between internal 
and external history which is markedly different from that drawn by other rationality 
theories. For instance, what for the falsificationist looks like a (regrettably frequent) 
phenomenon of irrational adherence to a ‘refuted’ or to an inconsistent theory and
which he therefore relegates to external history, may well be turned in my 
methodology internally as a rational defence of a promising research programme. 


For instance there may have been an experiment which was accepted instantly, in
the absence of a better theory, as a negative crucial experiment. For the
falsificationist such acceptance is part of the internal history; for me it is not rational and
has to be explained in terms of external history. 


I propose to discuss a historical example in which an experiment was 
accepted instantly in the absence of a better theory and challenge the 
followers of Lakatos to show that it was not rational and also produce 
evidence of socio-psychological phenomena. 


DAVY’S EXPERIMENTS 


According to the theories of Lavoisier which dominated Chemistry 
1790-1810 an acid was the oxide of a non-metallic element. The acid formed 
by distilling common salt and sulphuric acid (now known as hydrochloric 
acid) was called muriatic acid and regarded as a compound of oxygen and an 
unknown element murium. The green gas evolved when muriatic acid was 
warmed with manganese dioxide was called ‘oxy-muriatic acid’ and was 
considered to be a higher oxide of murium. 

In 1810 Davy tried to decompose ‘oxy-muriatic acid’ into oxygen and 
murium. He burnt phosphorus in the gas expecting to obtain a white solid 
(phosphorus pentoxide) but the products were a volatile liquid and a solid
quite different from phosphorus pentoxide. Similarly sulphur did not 
produce any sulphur dioxide or trioxide but another liquid. Charcoal when 
heated in the gas had no reaction even at a high temperature. 

Because no oxygen could be extracted from ‘oxy-muriatic acid’ and 
because all the products obtained between ‘oxy-muriatic acid’ and the 
known elements sulphur and phosphorus weighed more than the original 
quantity of gas, Davy concluded that ‘oxy-muriatic acid’ was also an element 
and named it chlorine after the Greek word for pale green. 
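
For orientation, the reactions described in this section can be written in modern notation, which neither Lavoisier nor Davy had. The display is a hedged gloss supplied here, not part of the historical record being discussed. Identifying the ‘volatile liquid’ and the ‘solid’ with phosphorus trichloride and pentachloride, and the sulphur product with disulphur dichloride, is the standard modern reading and is assumed rather than stated in the text. The last line is the synthesis from hydrogen on which the argument goes on to rely.

```latex
\begin{align*}
\mathrm{NaCl + H_2SO_4} &\rightarrow \mathrm{NaHSO_4 + HCl}
  && \text{(preparation of `muriatic acid')} \\
\mathrm{MnO_2 + 4\,HCl} &\rightarrow \mathrm{MnCl_2 + Cl_2 + 2\,H_2O}
  && \text{(preparation of `oxy-muriatic acid')} \\
\mathrm{2\,P + 3\,Cl_2} &\rightarrow \mathrm{2\,PCl_3}
  && \text{(volatile liquid)} \\
\mathrm{2\,P + 5\,Cl_2} &\rightarrow \mathrm{2\,PCl_5}
  && \text{(solid)} \\
\mathrm{2\,S + Cl_2} &\rightarrow \mathrm{S_2Cl_2}
  && \text{(liquid)} \\
\mathrm{H_2 + Cl_2} &\rightarrow \mathrm{2\,HCl}
  && \text{(synthesis of `muriatic acid')}
\end{align*}
```

No oxygen appears on either side of the last four equations, which is the modern way of stating what Davy's failure to extract oxygen indicated.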

Because ‘muriatic acid’ can be synthesised exclusively from hydrogen (an 
element) and chlorine (shown to be an element) it follows that ‘muriatic acid’ 
is a compound of hydrogen and chlorine, contains no oxygen and should be 
renamed hydrochloric acid. Lavoisier’s Theory of Acids had therefore been 
falsified.

Ihde [1980] states:


The theory of oxygen as an acidifying principle was totally demolished by Davy’s
work on chlorine in 1810. 


And Partington comments: 


Lavoisier’s oxygen theory of acids had now gone the way of phlogiston. (Partington 
[1964], page 55) 


One or two very weak oxygen-free acids had been isolated prior to Davy’s 
researches but because of their weakness had not been regarded as seriously 
damaging to Lavoisier’s theory. If, however, the well-known strong acid,
muriatic acid, had to be reclassified as the oxygen-free hydrochloric acid, then
Ihde’s reference to demolition is entirely appropriate.

It is a well known psychological fact that some people take a lot of 
convincing and it is a matter of historical record that one leading chemist 
refused to be convinced for seven years, although admitting that he was out 
of step with everybody else. Berzelius had just finished extending 
Lavoisier’s oxygen theory into a comprehensive ‘electro-chemical’ or 
‘dualistic’ theory of acids, bases and salts. In the best Lakatosian traditions 
he was not prepared to give up a promising research programme for the sake 
of a single empirical result. All his contemporaries indulged in what Lakatos 
describes as an ‘irrational act’: they accepted that the existence of a 
STRONG oxygen-free acid had refuted Lavoisier’s theory according to the 
logic of the modus tollens (Partington [1964], page 167). 
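
The inference attributed here to Berzelius’s contemporaries can be set out schematically. What follows is only a sketch in modern predicate notation: the predicates Acid and Ox (‘contains oxygen’) and the constant m (muriatic acid) are labels assumed for illustration, not notation used by Lavoisier, Davy or Partington.

```latex
\begin{align*}
&\text{(T)} && \forall x\,\bigl(\mathrm{Acid}(x) \rightarrow \mathrm{Ox}(x)\bigr)
  && \text{Lavoisier's theory of acids} \\
&\text{(P)} && \mathrm{Acid}(m) \rightarrow \mathrm{Ox}(m)
  && \text{prediction entailed by (T)} \\
&\text{(O)} && \mathrm{Acid}(m) \wedge \neg\,\mathrm{Ox}(m)
  && \text{Davy's result} \\
&\text{Hence} && \neg\,\forall x\,\bigl(\mathrm{Acid}(x) \rightarrow \mathrm{Ox}(x)\bigr)
  && \text{by modus tollens}
\end{align*}
```

On this reading the weak oxygen-free acids isolated earlier had the same logical form as (O); what differed in 1810 was the willingness of chemists to accept the premise for a strong acid.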

A large number of ad hoc and generally unsatisfactory theories were 
devised in the theoretical vacuum following the demise of Lavoisier’s 
theory. None of them fulfilled the criteria satisfied by the oxygen theory 
prior to its refutation as regards generality, details of prediction and 
precision of formulation. The chemical world had to wait till 1838 for the 
meritorious successor theory. 

Leicester [1956] writes the following (my italics): 


When Gay-Lussac studied hydroiodic acid in 1813, he admitted the correctness of 
Davy’s views. Thus Davy gave the deathblow to Lavoisier’s theory of the 
composition of acids. 

An actual understanding of the essential nature of acids came from the work of 
Graham and Liebig later in the century. Thomas Graham showed that ortho-, pyro-, 
and metaphosphoric acids were distinct substances which in his formulation, 
contained three, two or one molecules of water that could be replaced by a 
corresponding number of equivalents of base. Justus Liebig generalised this in his 
theory of polybasic acids, showing that organic acids existed that could combine with
various equivalents of bases. He therefore assumed that acids were compounds of 
hydrogen and that this hydrogen could be replaced by metals. 


CONCLUSION 


For a period of 25 years (from Gay-Lussac’s acceptance of Davy’s work in 
1813 to Liebig’s paper in 1838) it was accepted by leading practitioners that
Lavoisier’s Theory of Acids had been refuted in the absence of a better
theory. Chemists simply went about their business and waited for a better
theory to materialise. This is in accordance with a Popperian interpretation
of ‘internal history’; it is up to the followers of Lakatos to show that this
episode was ‘not rational’ and part of the so-called ‘external history’.


F. MICHAEL AKEROYD 
Bradford College, Bradford


Reviews


WATKINS, JOHN [1984]: Science and Scepticism. Hutchinson. Pp. xii+ 387. 
£25.00. (ISBN 0-09-158010-2.) 


John Watkins is admirably straightforward about his aims in Science and 
Scepticism: ‘To succeed where Descartes failed: to submit our knowledge of
the external world to an ordeal by scepticism and then, with the help of the 
little that survives, to explain how scientific rationality is still possible . . . to 
find an answer to Hume, but one that accepts the validity of, and is not 
vulnerable to, his central negative thesis’, an answer which makes no use of 
transcendental arguments, or assumptions about the simplicity of nature or 
our attunement to nature, or illicit inductive use of the probability calculus. 
Both in his overall aim of combining probability scepticism with a denial of 
rationality scepticism, and in the detailed execution of that aim he is firmly 
Popperian, though, as he says, and as we shall see, he departs sharply from 
Popper in finding appeals to verisimilitude covertly inductive, and in 
producing a type of justification for the statements we use as our empirical 
basis. I would add to this that there is in Watkins a far greater stress on both 
the certainty of statements about one’s current perceptions and on their role 
in scientific rationality than I find in Popper. 

Let me say at the outset what a pleasure I took in reading Science and
Scepticism. Its clarity, its honesty and its refusal to shirk the difficult 
questions, as well as the care, detail and ingenuity with which key notions are 
developed are all wholly admirable. So too are the discussions of significant 
moments in the history of both science and philosophy. As a full and 
sustained account of why scepticism about empirical knowledge is a genuine 
problem for real philosophers, it could hardly be bettered. Much of the 
ground covered here is naturally enough familiar in outline (though Watkins 
often adds new and engaging insights), but some is not. Here I would draw 
attention particularly to the ingenious argument in the extremely detailed 
chapter on probabilism, which shows that Hintikka’s attempts to base a 
probabilistic confirmation of scientific laws on the assignment to the 
constituents of those laws an a priori probability based on the probability of 
its being true in a completely random universe with a certain specific 
number of individuals would actually have the effect of lowering the a 
posteriori probability of those laws, even though the evidence actually 
appeared to favour them, unless we assume in making our probability 
calculations that the number of individuals in the universe is actually less 
than the number we have observed. Doing this, however, would not only be 
unreasonably counterfactual in itself; it would also have the effect of making 
the least falsifiable empirical hypothesis the least probable, which is to
offend against what Watkins calls a paramount principle of probability. The
coup de grâce is administered when it is realised that the least falsifiable 
hypothesis and the one that will get most increase of probability on 
Hintikka’s system when evidence starts to come in, if we obey the paramount 
principle of probability, is the hypothesis that denies that there are any 
universal laws at all. So actual evidence will decrease the probability of any 
lawlike hypothesis whatever! 

Greater probability in scientific theories is for Watkins an unrealisable 
aim. At the same time he rejects the attempts of Mach and his followers to 
rescue a certain and confirmable core from the statements of science, in the 
shape of statements about perceptual phenomena (which Watkins follows 
Hume in exempting from sceptical doubt). Watkins sees Machian and other 
reductionist accounts of science as overemphasising the aim of truth at the 
expense of depth and explanatory power in scientific explanation. 
Moreover, he argues both that a science that did not implicitly rely on 
general laws and theories would be incapable of making many of the singular 
predictions science can actually make, and that tests of such theories cannot 
be made on purely phenomenal premises. Instead of making an un- 
obtainable and illusory certainty an aim of science, Watkins argues initially 
for depth, unification of theories, and predictive power and exactitude as 
aims of science. The most substantial chapter of the book elucidates these 
aims in considerable detail, epitomising them as demands for increasing 
depth of explanation and width of empirical coverage. Increase of empirical 
content is involved in both these demands, clearly enough in the second 
case; in the first case, it comes about because deeper theories entail empirical
generalisations of greater content than their rivals. Comparisons of content 
are, of course, problematic, and Watkins introduces the notion of an 
incongruent counterpart to deal with this. In essence, two statements are 
incongruent counterparts if their respective consequence classes match each 
other; any consequence of one has an equally informative counterpart, 
congruent or incongruent, among the consequences of the other, and vice
versa. In this way, Watkins avoids certain well-known problems with 
statistical, question-answering and potential falsifiability criteria of content, 
at least in cases where two theories are equally informative, or where one 
goes beyond and revises the other, and makes a genuine contribution to 
discussion of content-comparison. He also has interesting things to say 
about the apparent reversals of content comparisons due to differences in the
languages in which logically equivalent propositions are expressed, and about
when the deep theoretical cores of theories play a genuine role in the 
derivation of observational consequences. 

Let us, however, assume that Watkins is successful in elucidating the 
depth pole of the scientific enterprise. Is he able to show that depth in a 
theory in his sense has anything to do with its truth? If he cannot do this, 
then his claim to have defeated rationality scepticism will be empty, for he 
will not have shown that idealised scientific practice has any rational basis as
far as relying on its predictions goes. Into his optimum aim for science, in 
place of the inductivist aim of proven truth, Watkins suggests the following: 


Science aspires after truth. The system of scientific hypotheses adopted by a person 
X at any one time should be possibly true for him, in the sense that, despite his best 
endeavours, he has not found any inconsistencies in it or between it and the evidence 
available to him (pp. 155-6). 

The evidence against which theories are tested is provided by the empirical 
basis. Watkins argues forcefully against any cut-and-dried notion of what 
this consists in and also (more problematically, in my view) against saying 
that any part of the empirical basis can be taken to be conclusively justified. 
All that we can be certain of are our immediate perceptions, and these are not 
enough to prove anything about the empirical world conclusively. 
Nonetheless, in Watkins’ view we can rationally accept as our empirical 
basis those physical object statements that best explain our perceptions, 
unless and until we find better alternatives to them, or later evidence tells 
against them. Though Watkins does not say this, explanation of this sort is 
immensely strengthened in the case of ordinary physical object talk by 
virtue of the fact that no alternative to it, as an explanation of most of our 
perceptions, is even conceivable; the problems that dog explanationism at 
higher levels due to multitudes of competing explanations do not realisti- 
cally arise here. There is, of course, a strong disagreement between Watkins 
and Popper over the empirical basis. Watkins sees rightly that Popper’s 
Friesian trilemma provides no solution at all to the problem, but only an 
infinite regress of derivations of tests, unless we regard at least one of our 
empirical test observations as itself rationally acceptable (and its acceptance 
not just a matter of dogma or psychology). 

Does the rational acceptability and explanatory power of a statement in 
our empirical basis mean for Watkins that it is true or likely to be true? I 
must admit that I am unclear about this. Certainly Watkins does not say that 
such a statement is true but only tested against perceptual experience and
quasi-rationally accepted. If he does not think we have grounds for calling 
such a quasi-rationally accepted statement true, I am not clear how far he has 
solved the Friesian trilemma. What is certain, though, is that he thinks that 
corroborations of higher level theories relative to the empirical basis provide 
no reason at all to speak of anything more than their possible truth. 
Moreover, he is resolute and, in my view, correct in emphasising that talk of 
verisimilitude in their regard is covertly inductive. Far from Lakatos 
needing in 1974 to urge Popper to introduce a whiff of inductivism into his 
philosophy, in Watkins’ view it was already there in Popper’s original 
accounts of verisimilitude as giving us reasons to believe that certain
theories are nearer to the truth than others. 

Why, then, do corroborations matter? Does Watkins succeed in defeating 
rationality scepticism? I am afraid that my answers to these questions are 
unfavourable to Watkins’ case. Indeed, I am not persuaded that on his view 
one is entitled to speak of a theory as being well-corroborated at all, given
that this notion depends on knowing when we have submitted a theory to a
severe test. But, without assuming that the world will be similar to the way it 
was previously, how do we know that a simple repetition of an old test is not 
just as severe as a completely new one? Watkins says (p. 296) that we know a 
priori a theory will be at greater risk from an empirically novel test than a 
merely repetitious one. But, that it is merely repetitious in the sense that 
might make it less severe is just what we cannot conclude without inductive 
assumptions. Similar difficulties arise, I think, when one attempts to pin- 
point just where a set of theories has gone wrong when a test they jointly 
entail goes against them; here Watkins refuses to treat some of the set as 
epistemologically superior, and apportion blame, correctly on his prin- 
ciples. But I am not clear that his solution to the severe test problem does not 
equally depend on assumptions about greater certainty of previously tested 
theories. Even if we could arrive non-inductively at corroboration- 
assessments, however, we still have to ask why they matter if we are not 
allowed to say that they are some guide to truth. Although Watkins rejects 
Goodman’s paradox (on the surely erroneous ground that unlike truly 
scientific theories, the proposition that all emeralds are grue does not 
permit us to make an inference from an observation of an emerald made now 
to a conclusion about that emerald’s colour after AD 2000), he also rejects 
David Miller’s claim that it is rational to act on a well-corroborated 
hypothesis because we have no reason to suspect that it is false unlike its 
refuted competitors. But, as Watkins argues, it is more than likely that there 
will be a plethora of unrefuted and incompatible alternatives all consistent 
with past evidence. Why, then, act on the best-corroborated of these? 
Watkins argues that, without saying that a well-corroborated theory, as 
opposed, say, to one that predicts some dramatic change in the near future
course of events, is true or likely to be true, we can still say that it is rational 
to act on the standardly well-corroborated one because, not predicting a 
sudden dramatic change but postulating the continuation of already 
observed regularities, it is the weaker hypothesis, and, at least as far as action 
goes, it is rational to choose a weaker as opposed to a stronger hypothesis. 
Now, I can see a sense in which a well-tested, well-corroborated hypothesis 
is weaker, relative to past evidence, than some very bold and novel 
competitor. But, without some assumptions regarding natural stability, it is 
very unclear that weakness in this sense has anything to do with reliability, 
let alone with truth. 

As so often with non-inductivist attempts to rationalise science, I felt left 
high and dry at this point. The aim of science is to aspire after truth, but we 
cannot say anything about the truth of specific theories. Corroborations 
matter, but only, it seems to me, given that a weaker hypothesis is likely to be 
more reliable than a stronger one, and this surely depends on inductive 
assumptions. As Watkins himself says (p. 352) any discussion of the aims of 
science must take account of the massive development of science since the 
seventeenth century. I cannot help feeling that this development of science
since the seventeenth century must have something to do with the actual
acquisition of truth, and not just with the striving for it. Aside from this, 
however, I cannot see how rationality-scepticism can be defeated without 
assuming that we sometimes do get to the truth, and also have reasons for 
thinking we do. I make these points here not as a narrowly negative 
criticism of John Watkins’ book, but precisely because I cannot imagine a 
more cogent, and fully articulated presentation of the attempt to combine 
probability scepticism with a denial of rationality scepticism. 


ANTHONY O’HEAR 
University of Bradford 


GELLNER, ERNEST [1985]: Relativism and the Social Sciences. Cambridge 
University Press. £22.50. Pp. viii+200 (ISBN 0-521-26530-4) 


This is a thoughtful and thought-provoking collection of essays which, in 
the words of their author, ‘play their part in a single, coherent, though as yet 
incomplete endeavour’. If and when it is completed, it will be clear how far 
in his view different civilisations can be compared, not only from the point of 
view of any one of them, but also in accordance with some absolute standard. 
As matters stand at present, Professor Gellner is prepared to argue that our 
civilisation can claim absolute cognitive superiority in the fields of science 
and technology, but that no such claim can be made in other fields such as 
philosophy or sociology. 

The purpose of the following remarks is to outline the manner in which 
Gellner contrasts positivism, which is inspired by science, and 
‘Hegelianism’, which is inspired by moral and social thought; to give 
examples of the way in which his account of the contrast throws light on 
some specific issues; and to express a criticism which seems relevant to his 
general undertaking. Positivism, as understood by Gellner, implies among 
other things that genuine knowledge, i.e. scientific knowledge, is cumulative
and progressive, that normally it can only be expressed in a special idiom, 
that its data must be considered as isolated atoms, which together constitute 
a single non-holistic or ‘granular’ world. Hegelianism, on the contrary, 
implies among other things that the pool of Ideas is finite. It finds the Idea of 
Eternal Return congenial and has a strong tendency towards assuming the 
interdependence of issues after the fashion of Hegel’s dialectic. Gellner’s 
discussion of the contrast includes an interesting interpretation of Popper’s 
philosophy, some aspects of which are claimed to be positivistic. The upshot 
of Gellner’s investigation is that ‘the positivists are right’ but ‘for Hegelian 
reasons’ —because, if I understand him rightly, positivism cannot provide 
the tools which are required for the attempted legitimation of a philosophical
position, while Hegelianism is flexible enough to fill this need.
Whereas man’s pursuit of science implies, according to Gellner, the
assumption of a single world which is open to rational investigation, the 
everyday human world lacks this Cartesian unity. It is best described in 
Wittgensteinian fashion as a complex of ‘plural and incommensurate, 
overlapping games’. To be involved in these games is not to investigate 
them, but to endeavour to avoid gaffes. The relationship and interaction 
between man the rational and man the gaffe-avoiding animal needs, as 
Gellner emphasises, further enquiry. What he does not doubt is that, 
because science is based on the assumption of a single world, it is 
incompatible with relativism. He also holds that while this single world was 
made by one kind of man, it is accessible to all human beings and that in fact 
it is gradually being adopted by all of them. Science in Gellner’s sense does 
not include the social sciences. They do share some features with natural 
science, but they lack the characteristic of involving an overall consensual, 
intellectual activity which is radically discontinuous from ordinary thought 
and unambiguously cumulative. 

About a third of the collection of essays is devoted to exegetic questions. 
One of the essays contains an interpretation of structuralisme, as he rightly 
calls it. This interpretation is meant to avoid, and succeeds in avoiding, the 
drawbacks of popular explanations which ‘fluctuate between the unin- 
telligible and the obvious with a heavy list towards the former’. According to 
Gellner structuralism involves a return to a generative model of expla- 
nation. It implies that the regularities which can be discerned in the 
phenomena emanate from hidden permanent forms or essences. He 
exemplifies this interpretation by reference to linguistics, mythology, ritual 
and symbolism and points to some serious limitations of the structuralist 
approach. The penultimate essay of the collection is a review of Cooking, 
Cuisine and Class by Jack Goody which I found interesting in itself and 
which made me curious to read Goody’s book. The last essay contains a 
critique of Kripke’s interpretation of Wittgenstein’s theory of concepts, in 
particular of the so-called consensus or communal view of the nature of their 
application. This is the thesis that the correct application of a concept 
depends on the social consensus of its users. Gellner produces some weighty 
arguments against this view. They include the comment that the consensus 
view fails to explain why some individuals have on occasion felt justified in 
defying the consensus and why in some cases they were subsequently 
vindicated. 

Among the thoughts provoked by the collection of essays are, of course, 
some objections. One objection, which seems fundamental to me, concerns 
the positivistic assumption that science is free from ‘non-granular’ assump- 
tions about reality as a whole. Against this it can be argued that science is not 
free from so-called ‘immanent metaphysical principles’, i.e. principles 
which are not part of the subject matter of a scientific investigation, but are 
among its presuppositions or, as I would put it, partly determine its 
categorial framework. Thus the principle that natura non facit saltus is an
immanent metaphysical principle of Newtonian and Einsteinian dynamics,
while its negation is an immanent metaphysical principle of Bohrian 
quantum mechanics. The truth or falsehood of either principle cannot be 
determined by experiment or observation. Yet each of these principles,
through the theories containing it, serves the cumulative extension of
descriptive and prognostic information. Indeed, even though the principles 
are contradictories, the acceptance of only one of them by all contemporary 
scientists would significantly reduce the available store of descriptive and 
prognostic information about natural phenomena. It goes without saying 
that this objection and others—e.g. a possible objection to Gellner’s view of 
the irrelevance of the speculative doctrine of free will to scientific thinking— 
do not affect the value of the collection of essays as a contribution to the 
philosophical and anthropological discussion of the problem of cognitive 
relativity. 


S. KÖRNER


HOOKWAY, CHRISTOPHER (ed.) [1984]: Minds, Machines and Evolution. 
Cambridge University Press. Pp. xi+177 (ISBN 0-521-26547-9) 


Minds, Machines and Evolution is a frankly ad hoc collection of eight
stand-alone essays, each having been given in a series of conferences held by the
Thyssen Philosophy Group during 1981-2, each essay dealing with either 
evolutionary biology or with computation, and most dealing with naturalis- 
tic accounts of the mental. 

They range from the editor’s own paper, a survey of the work of Peirce, 
Quine and Donald Campbell, which is, as the editor says, ‘more purely 
philosophical than the others’ (p. viii), to John Maynard Smith’s contri- 
bution, the most purely unphilosophical paper, a tutorial on kin selection 
and evolutionary game theory. 

The papers by David Hull and Elliott Sober fall pretty squarely into the
philosophy of biology. Hull’s is a heavyweight and heavy-going essay which 
opposes the covering-law model of explanation in biology and the social 
sciences. Sober defends causalism in the philosophy of biology much as 
Nancy Cartwright does in the philosophy of physics. He argues that 
selection is a force and fitness a disposition and that neither is reducible to
its ceteris paribus effects.

Like David Hull’s, Neil Tennant’s encyclopaedic essay covers a lot of 
ground: facts and theories about animal communication, philosophical 
theories of syntactic structure, imagined evolutions of language and 
thought. His conclusions: that there can be unstructured talk without 
thought, that there had to be thought before structured talk and that talk
with structure, in particular recursive structure, generates more structured
thought. 

Yorick Wilks on the other hand has some unusual ideas. In his Machines 
and Consciousness he argues that intelligent machines have four 
characteristics—the software that runs on them is modular, is 
implementation-independent (or portable), is relatively high-level and 
cannot be reconstructed from the machine code into which it is compiled or 
interpreted (that is, the interface between language levels is opaque). 
Consciousness has similar features he says. The low, physiological levels of 
intelligent behaviour are generally invisible to the higher, mental levels. 
And so Wilks argues for a ‘programming language’ model of consciousness 
according to which consciousness is blind to its implementation in the brain 
in the same way that a high-level language is ‘blind’ to the machine-code it 
compiles to. ‘The “highest level of program”’, he writes, ‘is the best
available explication of consciousness’ (p. 125). 

A cognitive wheel, Dennett tells us in his essay, the best in the volume, is 
‘simply any design proposal in cognitive theory (at any level from the purest 
semantic level to the most concrete level of “wiring diagrams” of the 
neurons) that is profoundly unbiological, however wizardly and elegant it is 
as a bit of technology’ (p. 147). You don’t find wheels in Nature though they 
are archetypally elegant as technological artefacts. Dennett suspects, and 
here he is a little too tentative, that ‘any AI system is and must be nothing 
but a gearbox of cognitive wheels’ (p. 147), that a workable AI system may
be no more like a human mind than a wheel is like a leg. 

Dennett’s starting-point is the frame problem: how do we, or intelligent 
machines, or planning systems in general, choose and pursue relevant 
inferences and ignore the myriad irrelevant ones on which we would 
otherwise waste our time? This is a tough problem for automated reasoning 
programs. It is also a deep epistemological problem and not only for AI,
though it is one that philosophers have managed to ignore. A merit of AI is 
that it has forced the philosophical frame problem into the open. Dennett 
suspects in particular that the current efforts to solve the frame problem, 
using Minsky’s ‘frames’, Schank’s ‘scripts’, McDermott and Doyle’s
‘non-monotonic logic’ may well be just more cognitive wheels. This seems to me
to be right but here Dennett leaves us dangling. 

And finally there is the contribution of Margaret Boden who, covering 
some of the puzzles touched on by Tennant, begins with a cartoon 
illustrating the problem of how kingfishers fish: do they know about 
refractive indices and Snell’s Law and all that? Boden is informative, as 
always, about the work of Marr and Ullman on perception and Hayes on 
knowledge representation, all of which she wants to believe to be other than 
more cognitive wheels. But the real philosophical meat of her essay is her 
arguments that they are not cognitive wheels, or should at least be taken not 
to be. Appealing as they do to the underdetermination of theory by evidence 
general to science and to Popperian tolerance of fruitful if ultimately
misguided research programmes, her arguments seem to me to be rather
weaker than her survey of recent work in AI.


PETER GIBBINS 
University of Bristol 


BUTTS, ROBERT E. [1984]: Kant and the Double Government Methodology. 
D. Reidel. Pp. xvi+339. (Hardback £20.25) (ISBN 90-277-1760-5).


Butts uses the expression Double Government Methodology (DGM) to 
describe a methodological dualism introduced by Leibniz for explaining
reality. It comprises a mechanical method (M1) for the study of the world of
bodies and a metaphysical method (M2) for understanding the world of
essences and ends. No matter how different, however, M1 and M2 are
meant to be complementary and mutually justified. 

Since this DGM rests ultimately upon Leibniz’s metaphysical doctrine 
that reality is a continuum of non-extended simple substances (monads) that 
are internally teleologically oriented, the fact that nature is to be investigated 
mechanically depends on the imperfection (to be accounted for by meta- 
physics) of human knowers. From a metaphysical viewpoint M2 is superior 
to M1.

Butts maintains that Kant too accepted a DGM. He too believed that the 
natural world is to be subjected to M1 and that one needs to complement the
latter with a non mechanistic approach to reality. But he did not identify this 
approach with M2 as envisaged by Leibniz nor did he believe that M2 
should be considered superior to M1. This attitude was dictated to Kant by
his concern with ‘the state of the supersensible’ which was, according to 
Butts, ‘his central problem throughout his philosophical career’ (p. 5). This 
concern was connected (A) with claims of clairvoyance; (B) with the 
problem of the reliability of the credentials of metaphysics, especially with 
respect to the mind (soul)-body connection. 

Now, since Leibnizian monads are soul-like substances whose sole 
activity is representing the universe to themselves, every ‘object’ of monadic 
activity is a merely represented something. Indeed every such ‘object’ is 
something supersensible and ‘each monad is a ghostseer; each monad has the 
power of extrasensory perception’ (p. 20). Obviously Kant could not accept 
monadology and, at the same time, not only criticise the clairvoyant 
Swedenborg, but also condemn any fanaticism (Schwärmerei) in morality
and religion founded on pretences of a private access to truth or God and
classifiable on a par with mental derangements.

According to Butts it was in the course of some investigations on mental 
health (1764) that Kant found a good argument for rebutting the claims of 
clairvoyance. For, having connected mental disorders with disoriented outer
sensation, he could later on (1766) establish external sensation as a decisive
criterion of evidence that admitted as a datum only what is publicly 
available. Certainly this was not sufficient to dispense completely with 
metaphysics: since he wanted to allow for human freedom, Kant still needed 
a metaphysical explanation of the way a soul (having no spatial location) 
could act on bodies. 

After tentative solutions (such as that spirits have ‘virtual location’) the 
first Critique succeeded in eliminating the problem: (1) the soul is ‘where’ 
the body is; (2) the ‘I’ that one connects to recollections of past events is just
what happened to the body variously positioned in space. This 
solution—that affected Kant’s later medical-philosophical speculations 
where in fact mental disorders are classified in terms of behaviour—meant 
the end of any concessions to Leibnizian metaphysics in both M1 and M2.
Moreover M1 was shown to be not only sufficient for the study of the natural
world, but also the preferred human way of knowing it. 

The rejection of Leibniz’s metaphysics as well as of all simple-minded 
empiricism, however, called for a new justification of the objectivity of M1
and of the applicability of mathematics to experience. This justification was 
supplied by Kant’s ‘constructive idealism’ whose meaning Butts in- 
vestigates through an analysis of schematism. 

Very appropriately Butts distinguishes the schematism of categories from 
that of empirical concepts. Categorial schemata are not meant to produce 
images: they are general rules for providing examples of what may be 
experienced. Just as space-time and categories constitute the grammatical 
framework of knowledge, categorial schemata function as semantical rules 
relevant to deciding (A) the applicability of categories and (B) the requisites 
for what is to be an object that can be understood within the semantics of the 
system of schematised categories. In short, the sphere of application of the 
system of schematised categories coincides with the phenomenal world. 

Within this system the schemata of empirical concepts function as rules in 
accordance with which we produce images corresponding to those concepts. 
These schemata, therefore, have also a heuristic value: if confronted with the 
alleged reality of some supersensible object (e.g. a soul) we have only to ask 
ourselves whether we can schematise that object. 

As for the question of mathematics Butts stresses that mathematical 
constructions are not mental pictures. What is important for Kant is the 
operation of construction, which is a public ‘transaction’ (p. 183) that 
enables us to generate a rule for constructing a certain mathematical object. 
This means that mathematics is not free from the constraints of the 
conceptual framework (space-time and categories) and of the system of 
schematised categories. Sentences of pure mathematics must be applied to
their constructions; sentences of applied mathematics must be ‘objectified’. 
But the objectivity of applied mathematics is guaranteed exactly by Kant’s 
constructive idealism according to which we make the world of possible 
experience, a world whose objects must be constructed as measurable, a
world in part invented, but (differently from the world of dreams or of
ESP) the only one we can know. 

Such a world is the world of science, a world of machines. So much so 
that, as Butts shows, Kant subordinates the use of hypotheses in science to 
prior assurance that what is hypothesised lies within the limits of possible 
experience. This same world, however, can be investigated with intent. But 
for this kind of investigation M1 is inadequate and must give way to a
different method. For neither space-time nor the system of schematised 
categories but ideas of reason can order our expectations that knowledge will 
form an organised world. 

M2 as conceived by Kant, therefore, does not fall back on metaphysics,
but introduces system and purpose in the investigation of the world. This is 
particularly evident in the third Critique where the Principle of reflective 
judgement (PRJ) states that to study nature as a set of systematically related
laws we must presuppose that nature is a logical system. The urge of reason 
to understand nature as a system, described in the Transcendental Dialectic 
of the first Critique, becomes an urge to impute design on nature as a
principle for making it understandable. 

This appeal to purposiveness is not in contrast with Kant’s constructive 
idealism. As Butts argues, it is because we ‘design’ objects that we think 
certain features of nature as designed. And it is because we are constrained to 
postulate a cause operating designedly that we are entitled to think 
designedly about nature. M1 and M2 (as in the case of Leibniz) are
complementary and mutually justified. 

Moreover, if, according to the solution of the third antinomy, we consider 
both mechanism and design as regulative maxims, we may accept either in 
specific cases by pragmatic success. Is thus teleology introduced in the study 
of nature? Butts gives a negative answer to this question by pointing out that 
teleological explanations (such as that of organisms) are for Kant only 
analogous to causal ones: we attribute design to an external cause just as we 
think of ourselves as causal agents. Even the third Critique’s appeal to a
‘supersensible substratum’ (God) is no more than an appeal to the 
intelligibility of nature as a guarantee that whatever system of explanation 
we will prefer it will be successful (rational). 

Kant’s DGM is then really double, but belongs to a unitary system. A 
system that Butts sees as one that ‘defines normal or healthy attempts to 
know’ (p. 282) in compliance with Kant’s belief that a philosopher’s duty is 
to prevent mental derangements by prescribing an illuministic diet capable 
of prolonging a human life free from the dangers of Schwärmerei.

The account given above gives only an idea of the complexity of the book. 
Among the topics it touches upon, but that it was impossible to discuss in 
detail, are: the influence of gnosticism on Leibniz; the correct interpretation 
of Träume eines Geistersehers (1766); Euler and Sömmerring’s theories of the
location of the soul; the significance of the third antinomy; Kant’s 
anthropological and medical interests. The wealth of themes treated and of
arguments developed is so impressive that Butts not only admits to leading
the reader through ‘tortuous and energy-sapping paths of philosophical and
historical analysis’ (p. 223), but also requires that the reader should be
capable of supplying for himself some of the connections that the book does not
make explicit (p. 317). The effort, however, will not be wasted: the book makes
very rewarding reading.

All the same two critical remarks seem appropriate. 

(1) Butts apparently overlooks the importance of Kant’s Wolffian 
education for his relation to Leibniz. Indeed one can detect in the book a 
tacit but definite assumption that Wolff (and perhaps even the variegated 
Wolffian school) can be identified with Leibniz. But nothing proves how 
wrong this identification is more than the influence exerted by Kant’s 
speculations about the mind (soul)-body problem on his ‘departure from
Leibniz’. For this problem had gained paramount importance in 
eighteenth-century German metaphysics thanks to some Wolffian scholars 
particularly near to Kant (Knutzen, Baumgarten, Meier) who had shifted 
the emphasis from the orthodoxly Leibnizian pre-established harmony, 
ruling general monadic relations, to the theory of influxus physicus, used to 
explain soul-body relations. This shift of emphasis, that followed Wolff’s 
superimposition of dualistic Cartesian elements on Leibnizian metaphysics, 
was taken for granted by Kant who already in his Gedanken of 1747, as
Butts acknowledges (p. 91), undoubtedly considered influxus physicus as
‘the’ philosophical problem. The neglect of this historical aspect of Kant’s 
attitude towards metaphysics does not impair the general thesis of the book 
about Kant and the DGM, but somehow impoverishes it, especially in view 
of Butts’ usual attentiveness to historical details. 

(2) The analysis of Kant’s philosophy from the perspective of an 
overturning of Leibniz’s DGM rather than from the usual one of the 
Copernican Revolution is very convincing and, at times, illuminating. Too 
little, however, is said in the book about the role of empiricism in the making 
of that philosophy. True, Butts mentions on occasion the fact that Kant was 
too much of an empiricist to accept ESP or Leibniz’s view of the
phenomenal world. But Kant’s empiricism was not simply a matter of his 
philosophical taste. As is universally known, Kant had Hume (who is never 
mentioned in the book) in the highest esteem, even when disagreeing with 
him, and certainly did not believe all empiricism to be simple-minded. So 
much so that if fear of fanaticism prevented him from being a true 
Leibnizian, it was fear of scepticism that prevented him from being a 
convinced empiricist. Thus Butts is absolutely right in maintaining that the 
teleology of PRJ is the Kantian substitute for Leibnizian metaphysics. But
it seems also evident that PRJ is Kant’s solution to the problem of induction.
For, while reflective judgement is the capacity of concluding from particular
to universal, PRJ guarantees that our inductive inferences are justified. To
this extent Butts’ view that for Kant we can think of nature as designed 
because we are ourselves designers (and vice versa) could be used to explain
how Kant could account for the legitimacy of empirical laws (and even
empirical concepts) without yielding to Hume, but surely having paid the 
utmost attention to his arguments. 

As is easily seen, both critical remarks are not directed against the book,
but intend to point out a sort of one-sidedness that often characterises 
serious and ambitious books. For this is a serious and ambitious book, and 
one which can be strongly recommended. 


MIRELLA CAPOZZI 
University of Siena 


KANE, JEFFREY [1984]: Beyond Empiricism: Michael Polanyi Reconsidered. 
Peter Lang, New York. Pp. 263. (ISBN 0-8204-0118-8) 


Polanyi, one of the great scientists of the mid-century, had qualified 
originally in medicine but switched to physical chemistry after the First 
World War. He did highly original work in chemical kinetics and molecular 
structure. Thus, when he turned to the philosophy of science, after the 
Second War, he brought to his studies a personal experience which gave him 
a very thorough understanding of how science is actually done. 

His main philosophical works, notably Personal Knowledge and The Tacit 
Dimension, were published in 1958 and 1967 respectively and were thus 
somewhat in advance of the great outburst of study on the comparison of 
mental activity with the performance of computers. They do, I think, have 
something very useful to contribute to A.I. research, as well as being of quite 
general philosophical interest. 

As is well known, he gave much emphasis to the tacit component of 
knowledge; as he put it, ‘we know more than we can tell’. For example the 
physician is not able to specify precisely how he recognises a case of epilepsy; 
he gains that ability only through practice. Similarly, during any skilled 
performance, such as piano playing, the player focuses his attention on the 
whole composition and not on the individual notes; indeed if he were to 
think separately about each key he is pressing his performance would 
become paralysed. And again we are able to recognise a person even though 
we cannot enumerate the individual features of his face. During recognition 
we are focally aware of the joint meaning of those features, as being a 
particular face, even though the separate features are known only in a 
subsidiary and tacit manner. 

Polanyi developed these ideas in a way which has a bearing on the creation 
of scientific theory. When the scientist considers some phenomenon, he 
brings to it all sorts of tacit assumptions and presuppositions about the 
nature of reality. These are only subsidiarily known to him; they are not 
explicit and are not at the focus of his attention. The presuppositions, 
together with items of empirical data, nevertheless provide clues to the 
problem his attention is focused on. The tacit/explicit distinction and the 
subsidiary/focal distinction thus play important parts in Polanyi’s epistem- 
ology. In each situation there is a triadic relation between: (a) a focal object; 
(b) a set of subsidiaries; (c) the ‘knower’ who brings the subsidiaries to bear 
on the focus. 

Dr Kane’s book is a good summary of Polanyi’s thought and he also makes 
some original points of his own. In his first chapter he contrasts Polanyi’s 
epistemology with Popper’s. Since for the former the content of the mind is 
not wholly explicit, scientists cannot commit themselves solely to rational 
principles; what they do is to (irrationally) commit themselves to un- 
formalised notions about the nature of reality. Thus whereas Popper’s 
conception of objective scientific knowledge presumes the elimination of 
personal components, Polanyi believed this to be misguided. Even the 
refuting data which scientists might apply to falsify a theory, in accordance 
with Popper’s model, requires personal judgment (p. 22). 

Chapters 2 and 3 are concerned with the triadic structure of knowledge, as 
mentioned above, and also with Polanyi’s point that entities can be subject to 
physico-chemical laws and yet not be fully understandable in terms of those 
laws alone. This applies very clearly to human artefacts. Our various 
devices, such as computers, are fully subject to natural laws and yet are not 
specifiable in terms of those laws. They must also satisfy some operational 
principle—that which says what the particular device is intended to perform. 

Polanyi sought to extend this idea to living creatures, including humans. 
Although their bodies obey natural laws, they can be comprehended only by 
attributing to them a purposiveness which cannot be made explicit at the 
physico-chemical level. The intentional aspects of human behaviour 
therefore escape physiological analysis. Polanyi thus anticipated some of the 
new directions, contrary to behaviourism, which have been taken recently in 
psychology. 

Nevertheless, as Kane points out, the development of these ideas led 
Polanyi into an inconsistency. He was a dualist in regard to the mind-body 
problem, and yet believed that mind had arisen, over evolutionary periods, 
from inanimate matter. To account for this he postulated the natural 
emergence of higher level operational principles of the type which he had 
quite reasonably attributed to artefacts. But whereas the latter principles are 
originated by humans, from whence came the operational principles which 
are said to account for the purposive aspects of human behaviour? This 
supposed ‘emergence’, in advance of the human mind, of such metaphysical 
entities as ‘principles’ seems inexplicable within the naturalistic framework 
which Polanyi adopts elsewhere. Kane is therefore inclined to regard 
Polanyi’s dualism as being merely rhetorical. 

Chapter 4 elaborates what has been said already about Polanyi’s theory of 
scientific discovery. What motivates the scientist is his sense of uncovering a 
hidden reality and what, for him, constitutes a good problem is one in which 
he already has an intuition of the existence of an ontological coherence 
awaiting discovery, a coherence between all the items which he knows only 
tacitly and subsidiarily. Kane develops these ideas in terms of an 
analytic/integrative cycle, and also in terms of Poincaré’s four phases of the 
process of discovery, viz. preparation, incubation, illumination and 
verification. 

This volume reads as a dissertation in educational philosophy, and indeed 
in the final chapter the author turns to the bearing of Polanyi’s epistemology 
on the teaching of science in schools. The educator’s task, he suggests, is not 
so much to inculcate a critical attitude in the pupil as to equip him with the 
subsidiary context, the ‘presuppositional image of reality’. It is said that, 
with this emphasis in teaching, the child’s heuristic vision of the world is 
extended. 


KENNETH DENBIGH 
Chelsea College, University of London 


ACHINSTEIN, PETER [1983]: The Nature of Explanation. Oxford University 
Press. ix + 385 pp. (ISBN 0-19-503215-2). 


Peter Achinstein’s The Nature of Explanation* is one of the most significant 
additions to the literature on explanation since Hempel’s Aspects of 
Scientific Explanation. No discussion of explanation can with impunity 
overlook this book and the host of thought-provoking arguments contained 
in it. Even if one rejected every one of Achinstein’s theses, the book would be 
well worth reading simply for its perceptive critiques of the views on 
explanation of other philosophers. 

Although the book is centred on the two questions of the analysis and 
evaluation of explanations, closely related issues are also dealt with. There 
are chapters on the analysis of causation, theories of evidence, the limits of 
explanation, and functional explanation. In my view, the last mentioned 
chapter provides the most sophisticated analysis available of statements that 
ascribe a function to something. This review confines itself to the two central 
questions of the book. Even within this vastly reduced ambit, it is difficult to 
convey a proper sense of the richness of the discussion. 

Explanation, like many other words capable of a process-product 
ambiguity, can refer either to an act of explaining or to the result or product 
of such an act. Achinstein offers a challenge to the sort of views on 
explanation familiar from the writings of, e.g., Hempel and Salmon. His 
basic idea is that what is conceptually prior is the explaining act, the result of 
the act being analysable only in terms which refer to such an act. 

First, then, Achinstein offers an analysis of an explaining act. Simplifying 


* I am grateful to Peter Achinstein for a letter and a discussion about this review, which helped 
me to see what I wished to say. Confusions are, of course, still my own. 
greatly on his account, to explain q (where ‘q’ is an indirect question of a 
certain sort to be specified in a moment), by uttering some sentence u, is to 
utter u with the intention that one’s so doing will render q understandable by 
producing the knowledge that the proposition expressed by u is a correct 
answer to q. 

But, then, what is understanding q? One might say that someone 
understood q iff he could in principle explain q, but of course this would be 
patently circular. Achinstein insists that his account is able to avoid 
circularity (pp. 71-2 and 259-61) and that he can account for understanding 
by means of ideas which can be ‘defined independently of explanatory 
notions’ (p. 260). Let’s see how this works, using a particular example to 
make the enterprise clear. 

Suppose the question is: Why did Nero fiddle? First, Achinstein 
introduces the idea of the complete presupposition of a question—in the 
example, it would be: Nero fiddled for some reason. Then, by means of 
grammatically specifiable transformations, we obtain the idea of a complete 
answer form for a question—in our example, it is: The reason that Nero 
fiddled is—. Next, Achinstein gives a grammatical characterisation of the 
idea of a content-giving sentence for a noun (in order to distinguish between 
such sentences as: (1) The reason that Nero fiddled is that he was happy; (2) 
The reason that Nero fiddled is difficult to grasp.). A proposition is said to be 
content-giving iff the sentence used to express it is content-giving (p. 39). 
Finally, a proposition p is a complete content-giving proposition for a 
question q iff it is a content-giving proposition and is expressible by a 
sentence obtained from a complete answer form for q (for ease of exposition, 
I overlook other conditions irrelevant to my argument, and conflate direct 
and indirect questions). 

If we overlook some complications about answers to questions being 
restricted by instructions for answering the question, we can sum up 
Achinstein’s account of understanding like this: a person understands q iff 
the person knows of some proposition p that it is a correct answer to q, and p 
is a complete content-giving proposition with respect to q. The idea of a 
correct answer presents no great difficulty; basically, the idea that p is a 
correct answer to q is just that p is a complete content-giving proposition 
with respect to q, and that p is true. 

Now, among content-giving sentences, and therefore among complete 
content-giving propositions, there will be examples such as these: the reason 
that Nero fiddled is that he was happy; the explanation for Nero’s fiddling is 
that he was happy; the cause of Nero’s fiddling is his being happy. It is 
essential to Achinstein’s account of understanding, if it is to avoid 
circularity, that his analysans for understanding makes no use of 
explanation-related concepts. His grammatical characterisation of the 
analysans permits him to avoid using such concepts, the hope being that the 
grammatical characterisation of a complete content-giving proposition will 
pick out in fact just those propositions about explanation and so on that one 
intuitively needs. Propositions about explanation (or causes or reasons) are 
examples of complete content-giving propositions, and such propositions 
can be mentioned, but no explanation-related concepts need be used in 
offering the analysans for understanding (a question). Achinstein’s avoid- 
ance of circularity stands or falls, as far as I can see, on the project of 
characterising the idea of a complete content-giving proposition grammati- 
cally, and using it in the analysis of understanding. 

It seems to me that the idea that one can elucidate understanding (and 
indirectly, explaining) in this way is hopeless. Recall our example: 
understanding why Nero fiddled. Frenchmen can understand why Nero 
fiddled, and yet the idea of a complete content-giving proposition for a 
question has been elucidated by means of the vocabulary and grammar of 
English. At best, we can have here only an account of understanding for 
English speakers. 

Perhaps this criticism is too swift. Suppose a Frenchman says: ‘La raison 
pour laquelle Neron jouait du violon est qu’il était heureux.’ If there exists a 
sentence-sentence translation manual for English and French, then we 
simply add to Achinstein’s analysans: if the person does not speak English, 
then we translate (into English) the proposition that he takes to be a correct 
answer to the question, and then see if the English translation meets the 
requirements laid down in his analysans. 

Could this constitute an illuminating account of understanding? I think 
not, but perhaps this depends on one’s philosophical views about what one 
expects from an analysis. The idea of an analysis, as I understand it, is to get 
at what is essential about an idea, in some sense. This is why extensional 
equivalence, by itself, is too weak a requirement for analysis. There might 
be an extensional equivalence between being an A and being a B (for 
instance) which was merely a lucky coincidence, and which therefore did not 
really reveal anything important or essential about the analysandum. This 
can be brought out by asking a counterfactual question: granted that being 
an A is extensionally equivalent to being a B, could there, in some sense, have 
been an A which was not a B, or a B which was not an A? 

This seems to me to be what is wrong with the suggestion that we keep 
Achinstein’s grammatical analysis of understanding but simply add a 
translation clause to it. Granted, it is extensionally necessary and sufficient 
for any non-English speaker to have understood why Nero fiddled that he 
know that some sentence in a language other than English is an answer to the 
question, a sentence that we can translate as ‘The reason that Nero fiddled is 
that he was happy’ (or a similar one in terms of explanation or cause). But 
suppose English had never existed. People could still understand why Nero 
fiddled, in spite of the fact that there was no English translation of what they 
thought (or knew) answered the question. Similarly, any account in any 
specific language will be unmotivated. Why produce the analysis in one 
‘home’ language rather than another? And any such account will fail the 
‘strong’ requirement of analysis that it get at something essential, because, 
counterfactually, any language selected for the account could have failed to 
exist, and still people could have understood why Nero fiddled. 

Could Achinstein be construed as offering an account of understanding, 
but one relativised to a specific language, like English? This construal of 
what he is doing fails for the same reason I adduced above. I presume that 
the persistence conditions for a language are such as to permit at least some 
minor grammatical changes. To pick only one example more or less 
randomly, in elucidating the idea of a content-giving sentence, Achinstein 
refers to the fact that it uses a form of the verb to be. But imagine that English 
had been like Hebrew, so that there were no present tense form of the verb to 
be. In such a case, one could have understood why Nero fiddled (in exactly 
the same sense as that in which one can now understand this), and yet the 
analysans offered by Achinstein would have been false. I submit that 
Achinstein cannot have offered us a grammatical analysis of understanding 
(even if relativised to a specific language), because even if what I have called 
‘the grammatical account’ and understanding-for-English were extension- 
ally equivalent, the former is superficial, does not really get at what 
understanding is, since the grammatical account could have been different 
and it could still have been true that one had understood, even relativised to 
English (it could still be English, with some small changes to the rules of 
grammar). 

If we then dismiss Achinstein’s grammatical account of understanding, 
what might we put in its place? I don’t say that any non-grammatical 
account of understanding is bound to be circular if understanding is 
employed to provide the analysis of explanation, only that Achinstein 
pinned his hopes of avoiding circularity on the grammatical account. 
Whether there is a non-circular alternative remains still to be seen. 

After discussing explaining acts, Achinstein then turns to the question of 
explanation, in the sense of a product of such an act. It is this part of the book 
which lies closest to the analysis of explanation, as traditionally conceived. 
According to Achinstein, an explanation-product is neither a proposition 
nor an argument, nor any other entity which can be characterised solely 
syntactically or semantically. Rather, an explanation-product is an ordered 
pair, consisting to be sure of a proposition (indeed, a complete content- 
giving proposition, such as ‘The reason that Nero fiddled is that he was 
happy’) but also of an explaining act type (e.g., the act type, explaining why 
Nero fiddled). If I utter the same proposition both in an act of explaining 
why Nero fiddled and in an act of criticising Nero for this extraordinary 
behaviour (to Metullus, say), my criticism is the 
ordered pair consisting of (perhaps) that proposition and the act type, 
criticising Nero’s fiddling, but my explanation is the different ordered pair 
consisting of that same proposition but the act type, explaining why Nero 
fiddled. 

Why can’t the product of an act of explaining be identical with the product 
of an act of criticising (e.g., when the same proposition is involved)? My view 
is that they can be, and that Achinstein’s ordered pair view is unnecessarily 
complicated (I leave open the question for now whether explanation-products 
are arguments or propositions or some other entity which can be 
syntactically or semantically identified). 

Achinstein wants to demonstrate that no explanation can be identical with 
a criticism by the following argument, which in turn relies on this principle 
(p. 76): 


(1) The product of S’s illocutionary act is an illocutionary product F only if 
S F-ed. 


Suppose that I explain why Nero fiddled by saying that the reason he fiddled 
was that he was happy. You, believing that happiness is no good reason for 
fiddling instead of helping to put out fires in such circumstances, criticise 
Nero by saying exactly the same thing. Now, says Achinstein, your criticism 
is a criticism. But if my explanation = your criticism, then my explanation 
would be a criticism. But it may not be, since I may have no intention of 
criticising Nero at all (I may believe that this is exactly the sort of thing 
emperors should do in such circumstances). But if my explanation is no 
criticism, it must be false that my explanation is identical to your criticism 
(for the other premiss, that your criticism is a criticism, is surely beyond 
dispute). Since the propositions expressed are identical, it follows, according 
to Achinstein, that the criticism and the explanation cannot just be the 
proposition. He concludes, as I indicated, that they are ordered pairs, 
consisting of the proposition, and a criticising and explaining act type 
respectively. 

I think that one of the premisses in Achinstein’s argument above is simply 
false. Achinstein assumes that it is false that my explanation is a criticism 
when I am not engaging in the act of criticising. But this does not seem right, 
for my explanation is a criticism, to wit, a criticism by you. What is wrong is 
(1), which should be replaced by (1*): 


(1*) The product of S’s illocutionary act is an illocutionary product F only 
if someone F-ed (perhaps but not necessarily S). 


(1*), unlike (1), has the merit of being true, but it will not permit Achinstein 
to draw the conclusion that he wants, that no criticism can be identical with 
an explanation (in the product sense). I hold that if I criticise and you 
explain by saying the same thing, then my criticism = your explanation. 
Achinstein has given us no good reason to think that the products of 
different illocutionary acts can never be identical. 

If we accept the proposition (or argument) rather than the ordered pair 
view of explanations, we can characterise explanation-products quite in- 
dependently of the idea of explaining acts. Explaining acts find their way 
into the account of explanation-products only on the ordered pair view of 
Achinstein. On the simpler view which I advocate, accounts of the act and 
the product of explaining can be given independently of one another. Of 
course, we only call a proposition (or ordered pair or set of propositions) an 
explanation if it is one that figures in an explaining act in a certain way. But 
that is no more reason to deny that an explanation may be identical to a 
criticism than there is to deny that the Morning Star is identical to the 
Evening Star, on the grounds that we only call the heavenly body the former 
when it appears in the morning, and only call it the latter when it appears in 
the evening. 

In private communication, Achinstein has claimed that my simpler 
theory will yield conclusions that are either unintuitive or false. Here is 
the crux of our disagreement: 

I claim that (1) my explanation of why Nero fiddled = your criticism of Nero’s 
fiddling. 

Suppose that (2) your criticism of Nero is unfair. 

It follows that (3) my explanation of why Nero fiddled is unfair. 


Achinstein claims that (3) is not true; (3) is either false or semantically deviant. 
Since ex hypothesi (2) is true, (1) must be false. 


I reply that there is an equivocal use of ‘criticism’ in premiss (1) and (2), 
which renders the argument invalid. In (1), ‘criticism’ (and ‘explanation’) 
must be taken in the product sense. In (2), ‘criticism’ must be taken in the act 
sense, since unfairness is a property of acts and not of propositions (nor, for 
that matter, of ordered pairs of propositions and act types). Achinstein has 
indicated to me that he intended, in order to avoid the charge of 
equivocation, that in (2), ‘criticism’ should be taken in the product sense, as 
it is in (1). I cannot understand how products of explaining and criticising 
acts, as distinct from the acts themselves, could be fair or unfair in the 
required sense. It is true that we sometimes make remarks such as: your 
criticism was fair, but it was unfair of you to make it. In that sort of remark, I 
take ‘fair’ to be equivocal. To call an act of criticising ‘fair’ is to morally 
appraise it; to call the criticism itself (the product) ‘fair’ is to say that it is true 
or justified in an epistemic sense. 

Chapters 4 and 5 of The Nature of Explanation are concerned in different 
ways with the requirements for explanation. Goodness of an explanation 
(product) is assessed along two parameters (confining ourselves to epistemic 
evaluation): correctness and appropriateness. I have already mentioned 
correctness; in essence, the correctness of an explanation is its truth (‘The 
reason why Nero fiddled is that he was happy’ is a correct explanation of his 
fiddling if it is true). Although Achinstein does not explicate appropriate- 
ness in quite this way, it has to do, inter alia, with the problem of explanatory 
depth. Shall I explain why Jones (poor chap, he’s so often ill) died by saying 
that he died of an illness, from food poisoning, from eating spoiled meat, 
from such-and-such chemical change in his body? All of these explanations 
might be correct. Achinstein’s thought here is that which correct answer I 
give (as he says, which set of instructions I follow in answering the question) 
will depend on the needs and interests of my audience. And these needs and 
interests will vary. No one of the set of correct answers is better or more 
appropriate tout court. Achinstein considers numerous examples from the 
history of science, and argues that there is no level of depth, no one set of 
instructions, universally required for all explanations, or even for all 
scientific explanations. As one tries to formulate what was required of such 
explanations on the examined occasions, Achinstein maintains that too 
much specific content has to be built into the instructions for giving the 
explanations to represent them as universally required for all explanations. 
His conclusion is that standards of appropriateness for explanation are 
contextual and pragmatic, and cannot be generalised to have universal 
scope. 

What of correctness? Can one find further conditions for the correctness 
of an explanation, other than its truth (e.g., deductive validity if explanations 
were arguments)? It is in the discussion of this question that I found 
Achinstein to be at his most challenging. For Achinstein, it will be recalled, 
we explain a question q (Suppose that it is ‘Why p?’) with a complete 
content-giving proposition (Suppose that it is ‘The reason that p is that r.’). 
To put this in another way: where the explanandum is that p, the explanans 
is the reason that p is that r. Now, this explanans offends a requirement of 
explanation that almost every writer on explanation has insisted upon: no 
singular sentence, by itself and in the absence of a law-like sentence, can 
entail the explanandum. In the case we are considering, that p simply follows 
deductively from the reason that p is that r. (Recall that Achinstein does not 
regard his account as circular, in spite of the fact that terms like ‘reason’ and 
‘explains’ occur in the explanans, since he thinks that all one needs is the idea 
of a complete content-giving proposition, it just being a matter of fact that 
some examples of such propositions happen to include these terms). Why 
does Achinstein hold that explanations are bound to offend this requirement 
that no singular sentence by itself can entail the explanandum? 

Suppose that poor Jones eats 1 lb. of arsenic at t and dies within 24 hours. 
Suppose further that it is a law of nature that anyone who eats at least 1 lb. of 
arsenic dies within 24 hours of his eating the arsenic: 


(1) Jones ate 1 lb. of arsenic at time t. 
(2) Anyone who eats at least 1 lb. of arsenic dies within 24 hours. 
∴ (3) Jones dies within 24 hours of t. 


Achinstein argues, correctly I think, that this cannot be a good explanation 
of Jones’ death, since premisses and conclusion can be true and yet the 
premisses fail to explain the conclusion. Suppose shortly after arsenic 
ingestion Jones is run over by a lorry and dies. If so, the argument permits us 
to validly infer his death from premisses (1) and (2), but it won’t help us to 
explain his death since the premisses do not tell us what brought about his 
death. 

Achinstein argues, convincingly, that the deductive relation between 
premisses and conclusion of an argument (even in conjunction with other 
formal conditions and the thought that the premisses are true) will not 
capture the idea of the premisses explaining the conclusion. In order to 
insure explanation in a case such as this (but not necessarily in all cases of 
explanation, since Achinstein rejects the idea that all explanations are causal 
explanations), there would have to be, among the premisses, a singular 
sentence that actually says that some event is the cause of, reason for, or 
explains, the event or whatever mentioned in the conclusion. (Or, alterna- 
tively, if there is no premiss which says this, then there must be a condition 
placed on which inferences will be acceptable as explanations—and this 
condition will itself make essential use of explanation-related terms). But 
note that such a singular sentence would itself entail the explanandum as a 
conclusion. So, according to Achinstein, good explanations, if they were 
arguments, would offend the requirement that no singular sentence in the 
premisses can by itself entail the conclusion. This also constitutes a strong 
reason for taking explanations to be propositions (with or without act types) 
rather than arguments of any sort, even ones in which a singular sentence as 
sole premiss entails a conclusion, for the view of them as arguments of this 
sort would not really add anything to the simpler view of them as 
propositions which say what the explanation of or reason for something is. 


DAVID-HILLEL RUBEN 
The London School of Economics 


An Approach to the Modelling of the Physical Continuum¹ 


by RICHARD JOZSA 


ABSTRACT 


We describe a way of constructing models for the continuum which does not require 
an underlying structure of points. With a condition of spatial homogeneity the 
models have the mathematical structure of a sheaf. 


The physical continuum and its properties have been modelled for 
hundreds of years in terms of the real line R and its rich mathematical 
structure. From time to time other models have been proposed but none has 
been adopted into any mainstream theory. The immense computational 
success in physics, of the techniques of real analysis, is surprising if we 
consider that the use of the real line has always been accompanied by 
problems, although mainly of a conceptual or interpretational nature. From 
the beginning there were difficulties with the idea of continuous motion, and 
how a particle occupying one position changes to the ‘next position’; and the 
foundations of calculus caused much controversy. More recently, in 
classical physics, difficulties arose in attempting to reconcile point particle 
sources with field theories (Feynman, Leighton and Sands [1965]). This 
became especially acute in the unsuccessful attempts of Lorentz, Poincaré 
and others, to construct a classical model of the electron, leading to problems 
of infinite self energy and the phenomenon of radiation reaction predicting 
incorrect motions (Jackson [1975]). These difficulties are related to effects at 
very small length scales, and the advent of quantum theory provided a 
convenient reason for ignoring them, while retaining the theory (to a very 
good approximation) at larger scales. Thus, the attitude was that at small 
length scales, the classical laws break down and are replaced by the quantum 
formalism; and the quantum formalism still retains the same mathematical 
model of the continuum. We wish to suggest here that perhaps it is not so 
much the statement of the physical laws themselves that needs to be refined 
but instead the mathematical model of the continuum needs to be changed 
while largely retaining the conceptual structure of the classical theory. 
In the quantum theory of a particle, the attribute of position (and most 
other physical properties), when they exist, still take values in the classical 
model R, while the conceptual structure and language of the theory have 


1 Thanks to Jenny Little and John Fitzgibbon for useful discussions. 

been radically changed, and as a result, many interpretational problems 
arise. In Schrödinger’s form of the theory, a particle is described by a 
wavefunction (replacing the classical data of its position and momentum as 
real numbers). Using the real line model for position, the following 
interpretational problems arise: 


(i) the particle need not have the attribute of a definite position (as a real 
number) but is generally in a ‘superposition’ of being at many 
different places. 

(ii) if a position measurement is carried out, the wavefunction ‘collapses’ 
discontinuously, and this change has no physical mechanism. 

(iii) the result of a position measurement (as a real number) can be 
predicted only probabilistically by the theory. 


We will give a simple illustration showing that interpretational 
difficulties, resembling the above ones, can arise from an inappropriate 
choice of mathematical structure to model a physical property. For the sake 
of the illustration, suppose the world is exactly as described by Newtonian 
mechanics (so that particle positions really are points in R³). Suppose also 
that our eyes have poor resolution, and when we look at a particle we can 
only tell whether it is in front of the left (L) or right (R) eye. Under these 
circumstances we would naturally model the particle position by the two- 
element set {L, R}. Suppose finally that we are able to tilt our head sideways, 
and view particles at an angle a, and denote the position values by L,, R,. We 
note that any particle, if measured, has a definite value in {L, R} and in {L,, 
R,} and we wish to investigate experimentally the relation between these 
two attributes. To this end we prepare a large number of particles in the state 
Land view them at tilt angle a (cf. Figure 1). We find to our surprise that the 
value is sometimes L, sometimes R,; in fact L, occurs with probability a/z 
and R, with probability (x—.«)/x. After the measurement however, the seen 
value of L, or R, persists. It would be tempting to theorize from this 
experiment as follows: 


(i) Each particle in state L is not in a definite state of $L_\alpha$ or $R_\alpha$. 

(ii) If an {$L_\alpha$, $R_\alpha$}-measurement is carried out, the state collapses from L 
to either $L_\alpha$ or $R_\alpha$. The collapse, somehow induced by the 
measurement, has no physical mechanism. 

(iii) The result of the measurement in (ii) can only be known 
probabilistically. 

Figure 1. 


The analogy with the quantum problems is evident. We emphasise that 
the alleged change of state upon measurement has no reality as a physical 
process at all and is entirely a mathematical artifice resulting from the 
inappropriate mathematical model of position. We could consistently build 
our theories on the {L, R} model and be forever plagued by interpretational 
problems. Alternatively we could seek a better model and construct a theory 
with a sound interpretational basis. The above illustration is similar to the 
explanation of the ‘mechanism’ of wavefunction collapse offered by some 
hidden variable theories (in which the wavefunction does not represent the 
complete specification of the state): if a system is described by some kind of 
probability density function p, then the acquisition of new knowledge about 
the system (as a result of a measurement) will require a change in p to 
incorporate the new information (i.e. replacing probabilities by conditional 
probabilities, contingent on the new information). Thus the description of 
the system will be altered discontinuously even though the system itself may 
suffer no physical change at all. The point of view we wish to advocate here is 
different from this approach (and appears not to have been previously 
considered in the large literature on hidden variables): instead of 
introducing extra real variables to supplement the probabilistic description 
of quantum mechanics, we abandon the requirement that all measurement 
results are real numbers. We suggest that only the standard classical 
properties be used, but the mathematical structures used to model their 
values should be enriched to incorporate ‘non classical’ behaviour. The rest 
of this article is devoted to providing a possible approach for the 
construction of such alternative models for the property of position. 
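
The remark about conditional probabilities can be illustrated with a toy calculation. The following is only a sketch under assumed numbers: the four hidden positions, their labels and the uniform prior are hypothetical and do not come from the text. It shows the description p changing abruptly on conditioning while the underlying position is never altered.

```python
from fractions import Fraction

def condition(p, event):
    """Return the conditional distribution p(. | event).

    p     : dict mapping outcomes to probabilities (summing to 1)
    event : set of outcomes learned to contain the true value
    """
    total = sum(prob for outcome, prob in p.items() if outcome in event)
    if total == 0:
        raise ValueError("cannot condition on an event of probability zero")
    return {outcome: (prob / total if outcome in event else Fraction(0))
            for outcome, prob in p.items()}

# Hypothetical hidden positions with a uniform prior (an assumption).
prior = {x: Fraction(1, 4) for x in ("x1", "x2", "x3", "x4")}
true_position = "x2"            # fixed throughout; never altered below

# A coarse measurement reveals only that the position lies in {x1, x2}.
posterior = condition(prior, {"x1", "x2"})
```

Here `posterior` assigns probability 1/2 to each of `x1` and `x2` and zero elsewhere, a discontinuous change in the description although `true_position` is untouched.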

J. L. Bell in Bell [1983], [1984] considers an example which bears some 
similarity to the one described above. His characterisation of quantum logic 
using a notion of ‘non-persistent forcing’ can be used to provide an 
alternative approach to the problems we discuss below. 

Perhaps the most idealised aspect of the real line model is its structure of 
points which are infinitely precisely distinguishable and independent. 
Rather than embodying the crucial idea of continuity, the existence of points 
epitomises an extreme form of discontinuity. Furthermore this idealisation 
is based on classical precepts, which have been overthrown by relativistic 
quantum mechanics (Chew [1963]). 

We wish to construct models which do not have a structure of points 
underlying the concept of localisation in the continuum. We begin by 
considering all measures of localisation on an equal footing and establish 
properties of the intuitive idea of localisation which do not depend on the 
existence of points. Exactly what is meant by a measure of localisation, and 
how it is represented mathematically, will depend on further physical 
properties, and here we will impose only a minimum of structure, which is 
expected always to apply. Denote the set of all localisations in the continuum 
by $\mathcal{C}$. Some localisations are weaker versions of others and we represent this 
by the structure of a partial order on $\mathcal{C}$: if $l_1, l_2 \in \mathcal{C}$ then 
$l_1 \le l_2$ means that a particle at $l_1$ may be regarded as also being at $l_2$ 
(although this may provide less information about its position). We will also 
denote the relation $l_1 \le l_2$ by an arrow $l_1 \to l_2$. As a simple example, 
consider position in one dimension along the real line R. We take $\mathcal{C}$ to be 
the set of all (closed) intervals and $l_1 \to l_2$ iff ‘the interval $l_1$ is 
contained in the interval $l_2$’. This model describes 
the familiar representation of measurements with error bars. The fact that 
intervals and their inclusion relations can be made up from points, is 
represented by extra structural properties of this partial order which need 
not hold in more general partial orders. 

Next we wish to incorporate the property of spatial homogeneity into the 
model. Homogeneity on the real line is usually expressed using the action of 
the translations. The notion of translation, through a fixed distance, itself 
presupposes the existence of points and here we must adopt a different 
approach. Homogeneity requires that the same structure of localisation 
exists at different places. In this case, the partial order $\mathcal{C}$ is built up of a 
patchwork of overlapping copies of a smaller structure which represents the 
generic structure of localisation at any given ‘place’. We represent this by 
requiring that the partial order $\mathcal{C}$ fibres over a base structure $\mathcal{B}$, 
representing the generic structure of possible localisations. The notion of 
homogeneity requires that all arrows in $\mathcal{C}$ are induced from arrows between 
elements of $\mathcal{B}$ in the following sense. Let $c_1$ be in the fibre over 
$b_1 \in \mathcal{B}$ and suppose that $c_1 \to c_2$ with $c_2$ over $b_2$. Then by 
homogeneity, for any $c_1'$ over $b_1$ we must have a $c_2'$ over $b_2$ with 
$c_1' \to c_2'$, i.e. any two places $c_1$, $c_1'$ with the same generic 
localisation $b_1$ are homogeneously equivalent in the sense that the 
structure of all localisations relating to $c_1$ is indistinguishable from the 
structure of all localisations relating to $c_1'$ in $\mathcal{C}$. Thus the partial order 
structure of $\mathcal{C}$ is induced by lifting a structure of arrows from the base 
$\mathcal{B}$ to the fibres. We emphasise that $\mathcal{B}$ need not be a partial order, i.e. 
we may have more than one arrow between $b_1$ and $b_2$ in $\mathcal{B}$ (cf. example 
below). Intuitively this corresponds to the possibility that if $b_1$ is a sharply 
defined localisation, and $b_2$ only broadly defined, then having a position 
localised to extent $b_1$ is consistent with many different positions localised to 
extent $b_2$. The transitivity of the partial order on $\mathcal{C}$ requires that the 
arrows in $\mathcal{B}$ have a composition relation, i.e. $\mathcal{B}$ is a category in the 
mathematical sense (cf. MacLane [1971]). In an analogy with the action of the 
translation group G on the real line, $\mathcal{B}$ would correspond to the quotient 
space R/G (i.e. a single point, the only measure of localisation in the real line 
model) and the fibres correspond to the orbits of the group action (i.e. all 
possible places that the localisation can occur). Collecting all of these 
properties of the fibre structure of $\mathcal{C}$ for a homogeneous continuum we arrive 
at the mathematical notion of a presheaf. (Actually, to coincide with standard 
sheaf theoretic terminology, we must reverse all arrows in $\mathcal{B}$, but not in 
$\mathcal{C}$, i.e. when we lift an arrow from $\mathcal{B}$ to $\mathcal{C}$ we reverse its 
direction. We adopt this convention in all that follows. Thus arrows in 
$\mathcal{B}$ flow in the direction of increasing sharpness of localisation, and in 
$\mathcal{C}$, in the direction of weakening the localisation.) 


Definition. A presheaf F on a category $\mathcal{B}$ is a contravariant functor F from 
$\mathcal{B}$ to the category of sets (cf. MacLane [1971]). If $b \in \mathcal{B}$, F(b) is 
called the set of elements of F over b and each map $\phi: b_1 \to b_2$ in 
$\mathcal{B}$ gives rise to a map 

$$F(\phi): F(b_2) \to F(b_1),$$ 

called the restriction map along $\phi$. 


The partial order $\mathcal{C}$ of closed intervals on the real line with inclusions 
(considered previously) provides an example of a homogeneous continuum 
in the above sense. In this example, the base category $\mathcal{B}$ and the presheaf F 
are constructed as follows. The objects of $\mathcal{B}$ (‘measures of localisation’) are 
‘floating’ intervals of all possible lengths, i.e. intervals on R with length 
specified but with position (midpoint) not specified. We denote the interval 
of length 2b simply by b. Here b is any positive number and we allow the 
extreme cases b = 0 and b = ∞. Arrows from $b_2$ to $b_1$ are to correspond 
exactly to all the ways that an interval of length $2b_1$ can be included inside an 
interval of length $2b_2$. Thus for an arrow to exist we require $b_1 \le b_2$ and 
there is an arrow $b_2 \xrightarrow{\xi} b_1$ for each real number $\xi$ such that 
$|\xi| \le b_2 - b_1$. In the presheaf $\xi$ will measure the displacement of the 
centre of the $b_1$ interval relative to the centre of the $b_2$ interval. The rule 
of composition of arrows is simply addition: 
$b_3 \xrightarrow{\eta} b_2 \xrightarrow{\xi} b_1$ composes to give 
$b_3 \xrightarrow{\xi + \eta} b_1$. (This is well defined because of the triangle 
inequality ensuring that if $|\xi| \le b_2 - b_1$ and $|\eta| \le b_3 - b_2$ then 
$|\xi + \eta| \le |\xi| + |\eta| \le b_3 - b_1$.) For the presheaf F we take 
F(b) = R for each b, and if $b_1 \xleftarrow{\xi} b_2$ in $\mathcal{B}$, we define 
restriction along $\xi$ by 

$$F(\xi): F(b_1) \to F(b_2), \qquad x \mapsto x - \xi.$$ 

An element x over $b_1$ represents the interval of length $2b_1$ centred at x, 
$[x - b_1, x + b_1]$. The restriction map $F(\xi)$ acting on x represents the 
inclusion $[x - b_1, x + b_1] \subseteq [x - \xi - b_2, x - \xi + b_2]$, i.e. if a value 
is known to lie in the interval $[x - b_1, x + b_1]$ then it is also known to lie in 
many intervals of length $2b_2 \ge 2b_1$, which are given exactly by the 
restrictions along all possible arrows $b_2 \to b_1$. It is easy to verify that the 
set of all elements of this presheaf, together with all restriction maps, 
reproduces the partial order of intervals on R with inclusions. 
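
The construction of the interval presheaf can be sketched in code. The following is a minimal illustration, not part of the original argument; the names `Arrow`, `compose`, `restrict`, `interval` and `contains`, and the sample numbers, are all assumptions introduced here. It checks that restriction along an arrow yields a containing interval and that restricting along a composite agrees with restricting in two steps.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Arrow:
    """An arrow broad -> sharp in the base category, labelled by xi.

    It records one way of placing an interval of half-length `sharp`
    inside an interval of half-length `broad`.
    """
    broad: float
    sharp: float
    xi: float

    def __post_init__(self):
        if self.sharp > self.broad or abs(self.xi) > self.broad - self.sharp:
            raise ValueError("need sharp <= broad and |xi| <= broad - sharp")

def compose(first, second):
    """Compose b3 --eta--> b2 (first) with b2 --xi--> b1 (second): labels add."""
    if first.sharp != second.broad:
        raise ValueError("arrows are not composable")
    return Arrow(first.broad, second.sharp, first.xi + second.xi)

def restrict(arrow, x):
    """Restriction along the arrow: x over `sharp` goes to x - xi over `broad`."""
    return x - arrow.xi

def interval(centre, b):
    """The closed interval represented by the element `centre` over b."""
    return (centre - b, centre + b)

def contains(outer, inner):
    """True if the closed interval `inner` lies inside `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

# Hypothetical sample data: b3 = 4, b2 = 2, b1 = 1.
eta = Arrow(broad=4.0, sharp=2.0, xi=1.5)
xi = Arrow(broad=2.0, sharp=1.0, xi=-0.5)
x = 10.0                                   # an element over b1 = 1
```

With these values `restrict(xi, x)` is 10.5, so the interval [9, 11] over b = 1 is sent to the containing interval [8.5, 12.5] over b = 2.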

The classical real line model of point positions can be expressed as a 
(trivial) presheaf. Here $\mathcal{B}$ is the category with one object, e, and one arrow 
$1: e \to e$. The fibre over e is R, the set of all point positions, which are all 
homogeneously equivalent. Restriction along 1 is the identity map from R to 
R. 

Although the above presheaf of intervals on R allows for position values 
which are not points, we may object that the sharp endpoints of intervals still 
reflect an underlying structure of points. We describe another example of a 
presheaf which can remedy this problem. Instead of using intervals to 
measure localisation, we use, for the partial order $\mathcal{C}$, some class of functions 
(say, square integrable, positive functions, $L^2_+(R)$) $\phi: R \to R$ which fall 
off to zero sufficiently rapidly at infinity in both directions. In this context, we 
think of the particle localisation $\phi$ as representing the ‘spread of the 
particle’s existence’ rather than the probability density of pointlike position 
values along R; i.e. the real line itself has no direct interpretation as a set of 
possible position values, but merely underlies the mathematical description 
of the model. Arrows in $\mathcal{C}$ are defined by 

$$\phi_1 \xrightarrow{\xi} \phi_2 \quad \text{iff} \quad \phi_2 = \phi_1 * \xi.$$ 

Here, $\xi: R \to R$ is a function and $*$ denotes convolution of functions: 

$$\phi_2(x) = \int_{-\infty}^{\infty} \phi_1(x - t)\,\xi(t)\,dt.$$ 

We require $\xi$ to be also a function in $L^2_+(R)$, and have zero mean: 

$$\int x\,\xi(x)\,dx = 0.$$ 

We see that $\phi_2(x)$ represents the effect of smearing (averaging) the values of 
$\phi_1$ around x where the average is weighted by $\xi(t)$ for values of $\phi_1$ at 
distance t from x. Hence if $\phi_1 \xrightarrow{\xi} \phi_2$ then, in a natural sense, 
$\phi_2$ is a less localised function than $\phi_1$ (which motivates the use of 
convolution in this model). 
To express $\mathcal{C}$ as a presheaf, let $\mathcal{B}$ be the category with one object 
(called P), and the arrows from P to P are the functions $\xi$ above. If $\xi_1$ and 
$\xi_2$ are two arrows, we define composition in $\mathcal{B}$ by 

$$\xi_1 \circ \xi_2 = \xi_1 * \xi_2 \quad \text{(convolution)}.$$ 

(We must also allow the Dirac delta function $\delta(x)$ as an arrow in $\mathcal{B}$ 
since, for the above composition, $\delta$ is the identity arrow: 
$\delta * \xi = \xi * \delta = \xi$ for all $\xi$.) The presheaf over $\mathcal{B}$ has 
$L^2_+(R)$ (i.e. the set of all elements of $\mathcal{C}$) as fibre over P and 
restriction along $\xi$ is given by convolution 

$$\phi \mapsto \phi * \xi \in L^2_+(R).$$ 

We note that this model bears a slight resemblance to the description of 
position in Schrödinger's form of quantum mechanics, and generalises the 
classical point model by allowing points to have an ‘internal structure’ of the 
monoid of functions $\xi$, with convolution. 
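
The convolution model can be tried out on a discrete grid. The sketch below is an illustration only: it replaces functions on R by finitely supported functions on the integers, uses a unit spike as a stand-in for the Dirac delta, and measures ‘spread’ by the variance of the normalised function. These are assumptions made for the example and are not prescribed by the text.

```python
def convolve(phi, xi):
    """Discrete convolution of two finitely supported functions on the integers.

    Functions are dicts {position: value}; absent positions have value 0.
    """
    out = {}
    for x, a in phi.items():
        for t, b in xi.items():
            out[x + t] = out.get(x + t, 0) + a * b
    return out

def mean(f):
    """Mean position of f regarded as a normalised weight function."""
    total = sum(f.values())
    return sum(x * v for x, v in f.items()) / total

def spread(f):
    """Variance of f regarded as a normalised weight function."""
    m, total = mean(f), sum(f.values())
    return sum((x - m) ** 2 * v for x, v in f.items()) / total

delta = {0: 1}                       # discrete stand-in for the Dirac delta
phi1 = {-1: 1, 0: 2, 1: 1}           # a fairly sharp localisation
xi = {-2: 1, 0: 2, 2: 1}             # positive, zero-mean smearing function
phi2 = convolve(phi1, xi)            # the arrow phi1 --xi--> phi2
```

In this example the spread grows from 0.5 for `phi1` to 2.5 for `phi2`, while convolving with `delta` leaves `phi1` unchanged, as the identity arrow should.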

In the heuristic arguments above leading to presheaf models, we made no 
explicit distinction between, on the one hand, the representation of 
quantities which are physically in principle not localisable to a point and, on 
the other hand, the representation of quantities which may be localisable, 
but have not been specified completely, e.g. the possible results of a 
measurement involving errors, made by an imperfect apparatus. In the 
latter case, we can think of the structure $\mathcal{B}$ as representing the various 
amounts of information that we can extract about a property, and the 
relations between them. The restriction maps then correspond to the 
process of ignoring part of the given information, producing a less accurate 
value from a more precise one. 

Thinking in these terms of information content, we can motivate in a 
natural way, a further refinement of the model: from a presheaf to a sheaf (in 
the sense of Grothendieck). This notion of sheaf requires further structure 
on the base category $\mathcal{B}$: 


Definition. Let $\mathcal{B}$ be a category with pullbacks. A Grothendieck topology on 
$\mathcal{B}$ is given by the following data. For each object $b \in \mathcal{B}$ we specify a 
set Cov(b) (of ‘covers’ of b). Each cover is a family of arrows into b and the 
following axioms must be satisfied: 


(1) $\{b \xrightarrow{\mathrm{id}} b\}$ (the set consisting of just the identity arrow on b) 
is always in Cov(b). 

(2) If $\{b_i \to b\}_i \in \mathrm{Cov}(b)$ and $\{b_{ij} \to b_i\}_j \in \mathrm{Cov}(b_i)$ then 
$\{b_{ij} \to b_i \to b\}_{ij} \in \mathrm{Cov}(b)$ (i.e. ‘covers of covers are covers’). 

(3) (stability under pullbacks) If $\{b_i \to b\}_i \in \mathrm{Cov}(b)$ and 
$\phi: b' \to b$ is any arrow then the pullbacks $\{b_i' \to b'\}_i$ of 
$\{b_i \to b\}_i$ along $\phi$ form a cover of $b'$. 


Definition. Let $\mathcal{B}$ be a category equipped with a Grothendieck topology. A 
sheaf F on $\mathcal{B}$ is a presheaf F satisfying the following completeness condition 
with respect to covers: 

Let $\{b_i \xrightarrow{f_i} b\}_i$ be any cover. Let 

$$\begin{array}{ccc}
b_{ij} & \xrightarrow{\;h_{ij}\;} & b_i \\
{\scriptstyle g_{ij}}\downarrow & & \downarrow{\scriptstyle f_i} \\
b_j & \xrightarrow{\;f_j\;} & b
\end{array}$$ 

be the pullback of $f_i$ and $f_j$ in the cover. 
Let $\sigma_i \in F(b_i)$ be any collection of elements of F over the $b_i$ which agree 
when restricted to the pullback $b_{ij}$, i.e. 

$$F(h_{ij})(\sigma_i) = F(g_{ij})(\sigma_j) \quad \text{for all } i, j.$$ 

Then there is a unique element $\sigma \in F(b)$ over b which generates the family 
$\sigma_i$ by restrictions along the cover: 

$$\sigma_i = F(f_i)(\sigma) \quad \text{for each } i.$$ 


These technical definitions can be clarified by considering a fundamental 
example: a sheaf on a topological space X. In this case $\mathcal{B}$ is the category of 
open sets of X with arrows corresponding to inclusions of open sets. The 
pullback of $U_i \hookrightarrow U$ and $U_j \hookrightarrow U$ is then just the 
intersection $U_{ij} = U_i \cap U_j$. The Grothendieck topology is defined by open 
covers in the usual sense, i.e. $\{U_i \hookrightarrow U\}_i$ is in Cov(U) iff 
$\bigcup_i U_i = U$. Consider the sheaf F for which F(U) is the set of all 
continuous maps $f: U \to R$, and the restriction along $V \hookrightarrow U$ is 
given by the actual restriction of a function on U to the subset V. Then the 
completeness condition for this sheaf states that if we have a family of local 
functions $\sigma_i: U_i \to R$ over an open cover $U_i$ of U, which agree when 
restricted to the overlaps $U_i \cap U_j$, then they can be patched together in a 
unique way to give a single function $\sigma: U \to R$ which reproduces all the 
$\sigma_i$ by restriction to the subsets $U_i$ of U. 
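
The patching condition can be made concrete with a small sketch. To keep it runnable, the example assumes a finite set of points with every subset treated as open, so that continuity plays no role; functions are represented as dictionaries. The names `restrict` and `glue` and the sample data are hypothetical.

```python
def restrict(f, V):
    """Restrict a function (dict point -> value) to the subset V."""
    return {x: f[x] for x in V}

def glue(sections):
    """Patch local functions together.

    sections : list of (U_i, sigma_i), sigma_i a dict defined on the set U_i.
    Returns the unique function on the union restricting to every sigma_i,
    or raises ValueError if two sections disagree on an overlap.
    """
    glued = {}
    for U, sigma in sections:
        for x in U:
            if x in glued and glued[x] != sigma[x]:
                raise ValueError(f"sections disagree at {x!r}")
            glued[x] = sigma[x]
    return glued

# Two overlapping 'open sets' and local functions agreeing on the overlap {3}.
U1, U2 = {1, 2, 3}, {3, 4}
sigma1 = {1: 0.5, 2: 1.0, 3: 1.5}
sigma2 = {3: 1.5, 4: 2.0}
sigma = glue([(U1, sigma1), (U2, sigma2)])
```

Restricting `sigma` to `U1` or `U2` recovers `sigma1` and `sigma2`, and `glue` rejects families that disagree on an overlap.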

In terms of information content of imperfect measurements, the 
mathematical structures in the preceding definitions can be interpreted as 
follows. As before, the base $\mathcal{B}$ represents the different amounts of 
information which may be obtained. To interpret the structure of covers we 
envisage a situation in which several imperfect measurements can be made 
on a single quantity. We wish to be able to piece together the separate 
information of these measurements into a single result which will be more 
precise than each of the given results. Thus we take $\{b_i \to b\}_i$ to be a cover of b 
iff the joint information content of the $b_i$’s can be used to provide the 
information content of b. Cov(b) provides a list of all the different ways that b 
can be constructed out of weaker pieces. The completeness condition for 
sheaves provides the construction of the more precise value from less precise 
values over a cover. With these interpretations it can be readily argued that 
the various axiomatic requirements in the sheaf definitions are all natural 
conditions to impose and we omit the details (Jozsa [1981]). The use of sheaf 
models in quantum mechanics has been advocated from a different point of 
view by Takeuti [1980] and Davis [1977]. 

The example of intervals on R can be made into a sheaf by introducing 
covers as follows. $\{b_i \xrightarrow{\xi_i} b\}_i \in \mathrm{Cov}(b)$ iff 

$$\bigcap_i \,[-b_i + \xi_i,\; b_i + \xi_i] = [-b, b].$$ 

This corresponds to the statement that if a value is known to lie in each of a 
collection of intervals, then it should be in their intersection. (We may also 
require covers here to be finite families.) It can be verified that the category 
$\mathcal{B}$ in this case has pullbacks, and the conditions in the sheaf definitions are 
satisfied, and the presheaf described previously becomes a sheaf (Jozsa 
[1981]). 
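
The covering condition for intervals can be checked mechanically. The sketch below takes the condition exactly as printed, with each arrow contributing the interval $[-b_i + \xi_i, b_i + \xi_i]$, and assumes finite families; the function names and the sample family are hypothetical.

```python
def interval_from_arrow(b_i, xi_i):
    """Interval [-b_i + xi_i, b_i + xi_i] attached to an arrow b_i --xi_i--> b."""
    return (-b_i + xi_i, b_i + xi_i)

def intersection(intervals):
    """Intersection of finitely many closed intervals, or None if empty."""
    lo = max(a for a, _ in intervals)
    hi = min(b for _, b in intervals)
    return (lo, hi) if lo <= hi else None

def is_cover(b, family):
    """family: list of (b_i, xi_i). True iff the intervals intersect in [-b, b]."""
    pieces = [interval_from_arrow(b_i, xi_i) for b_i, xi_i in family]
    return intersection(pieces) == (-b, b)

# Two coarse measurements: the intervals [-1, 5] and [-3, 1].
family = [(3.0, 2.0), (2.0, -1.0)]
```

Here the two coarse intervals intersect in [-1, 1], so `is_cover(1.0, family)` holds: jointly they carry the information content of b = 1, although neither does alone.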

Suppose that the base category $\mathcal{B}$ has a terminal object $b_0$ (i.e. for any 
$b \in \mathcal{B}$ there is a unique arrow $b \to b_0$). Then for any sheaf F over 
$\mathcal{B}$, the elements $F(b_0)$ over $b_0$ (called global elements) have the maximum 
possible sharpness of localisation, since any other amount of information b can 
be extended to $b_0$ by the existence of the arrow $b \to b_0$, and thus are 
analogous to the point elements of a set model. In any sheaf or presheaf, each 
global element gives rise to an element over any $b \in \mathcal{B}$ (by restriction 
along $b \to b_0$). However, in general, the global elements will not give rise to 
all elements over $b \in \mathcal{B}$ and $F(b_0)$ may even be empty. Thus the 
interlocking structure of all elements of 
F and their restrictions, is not generated by an underlying structure of 
independent global elements. In this sense, a sheaf model of the continuum 
need not be based on a structure of independent pointlike locations, and thus 
successfully circumvents our basic objection to the use of the real line. 

Finally, we mention two remarks on the use of sheaves or presheaves as 
models in physics. Firstly, the models we have introduced are of great 
generality and therefore not very illuminating for the intended particular 
application to the physical continuum. In effect, we have replaced the 
conventional notion of a point by a general category $\mathcal{B}$. (Recall that sets are a 
special case of sheaf models, for which $\mathcal{B}$ is the one object category with only 
the identity arrow.) It is intended that further physical input about the 
nature of localisation can be used to restrict the choice of $\mathcal{B}$. 

Secondly, even if sheaf models were successfully established, we would 
require techniques analogous to those of conventional mathematical 
analysis, to develop physical theories within the context of sheaf structures. 
A great deal of mathematical development has already occurred in this 
direction (Johnstone [1977], Kock [1981]). We mention here only that the 
collection of all sheaves over a fixed base $\mathcal{B}$ has many of the structural 
properties of the collection of all sets, and this comparison can be made 
precise using a formal language which enables sheaves to be manipulated as 
though they were sets (Johnstone [1977], Jozsa [1984], Kock [1981]). There 
is a formal difference however. Set constructions always respect the laws of 
classical logic whereas the analogous sheaf constructions always respect the 
laws of intuitionistic (or constructive) logic. Intuitionistic logic originated in 
the work of L. E. J. Brouwer, in an attempt to develop a mathematical 
system which used only constructive methods. It was recognised early on 
that there is a close connection in mathematical analysis, between the notion 
of constructivity and the notion of continuity, since both represent a kind of 
stability under small perturbations in initial conditions. Thus the fact that 
sheaves provide models for the formalisation of constructive mathematics 
lends further credence to a role for sheaves in a description of an 
‘intrinsically continuous’ continuum. 

University of New South Wales 


On Time and Actuality: The Dilemma 
of Privileged Position 


by PALLE YOURGRAU 


“Intuitively, we look at matters thus: K is the set of all ‘possible worlds’; G is the ‘real 
world’.” 

Saul Kripke, “Semantical Considerations on Modal Logic” (Kripke [1963]) 


Introduction 
Time, and the Semantics of ‘now’ 
The Semantics and Metaphysics of the ‘actual world’ 
Referring to Time and Modality 
Movement in Time and Modality 
Conclusion: Is Lewis Correct? 


1 INTRODUCTION 


It has seemed obvious to many that while the present reflects nothing 
objective in the nature of things, the actual world is the very stuff reality is 
made of. With the notable exception of David Lewis, most philosophers 
have argued for a fundamental asymmetry concerning our position in time 
and our location in ‘modal space’. Whereas it appears to be a commonplace 
(given the Minkowskian geometry of space-time) that there is no ontological 
privilege accorded the present, the actual world is typically granted special 
privileges, over and above its counterparts. Given the prevalence of this 
asymmetrical point of view, the question arises whether the doctrine of 
‘privileged position’ in modality but not in time can be successfully 
defended. My primary aim, here, is to draw attention to the fact that while 
the arguments marshaled against the doctrine of ‘temporal chauvinism’ are 
powerful and persuasive (which is not to say, sound [see Yourgrau [1985]]), 
those that have been offered against Lewis, in favour of ‘modal chauvinism’, 
are characterized rather by a surprising lack of rigor. To this end, I will 
examine three representative arguments against Lewis, offered by 
Stalnaker, van Inwagen, and Forbes/Davies. We begin with the term, “now”, to 
see how it is typically analyzed, and to examine the view of time that emerges 
from that treatment. We will then compare this with the very different 
treatment accorded “actual”. 


2 TIME, AND THE SEMANTICS OF “NOW” 


J. M. E. McTaggart argued that time (if it exists—which he denied) has two 
fundamental features, which he called the A-series and the B-series 
(McTaggart [1927]). The former represents an event’s occurring now, or in 
the present, while the latter expresses facts like one event’s occurring earlier 
or later than another. Philosophers sympathetic with McTaggart argue that 
the only events that are really occurring are those that are in the present, i.e. 
that are occurring now. Modern physics, however, typically ignores the A- 
series, and this can be seen as justified if we accept the analysis of indexicals 
(“I,” “here,” “now”) offered by John Perry, which dominates the 
philosophical scene today (Perry [1977]). According to Perry, the classical 
Fregean theory of dividing the semantic value of a singular term into sense 
and reference fails for indexicals. For such terms there just are no Fregean 
senses. In using them, therefore, one is not thereby attributing some ‘sense’ 
(or property) to what is being referred to. The picture suggested by Perry is 
rather that in learning to use such terms one grasps the associated role. The 
role, for example, of “I”, is to refer (simply) to the speaker. Grasping such a 
role, one is able to achieve a reference, but without the intermediary of a 
Fregean sense, that is, without attributing some property to the referent. 
Exploiting Perry’s account as it applies to the term “now”, D. H. Mellor 
(Mellor [1974]) argued that since the ‘role’ of this term (it refers, when used 
at time t, to t) exhausts its semantic contribution, McTaggart’s A-series 
does not reflect an objective feature of time. That an event occurs now, on 
this view, does not reflect a genuine property of the event, but rather the fact 
that a speaker, expressing this thought at time t, is able thereby to refer, 
simply, to t. 

This view represents a purely semantic argument against the objective 
significance of the A-series, in terms of an analysis of “now”. I have argued 
elsewhere that this semantic theory of indexicals is seriously inadequate 
(Yourgrau [1982], [1987a], [1987b]), and will not, therefore, indulge here 
in a critique. My purpose is, rather, to compare philosophers’ treatment of 
time and modality, and since Perry’s analysis of indexicals is still the domin- 
ant one, it is the one that I shall focus on. But in addition to this semantic 
critique of McTaggart’s A-series there is one based upon considerations 
arising from the Special Theory of Relativity. Hilary Putnam (Putnam 
[1967]) and also Mellor (Mellor [1974]) have argued that given the relativity 
of simultaneity in Einstein’s theory, there is no such class as the class of all 
events simultaneous with one occurring now. More precisely, 
the problem is that: (a) from different inertial frames different events will 
appear simultaneous with a given event held to occur now, and (b) if it is held 
that only what is occurring now is what is really happening, then reality itself 
will differ, depending upon which position we view it from. The B-series 
survives this criticism, since events with a ‘time-like’ separation will, even in 
the Special Theory of Relativity, be, in an absolute sense, one earlier, one 
later, than the other. 

This argument against the A-series on behalf of Special Relativity is also 
one that I have misgivings about,¹ but again, given its dominance and the 
prevalence of the view of the nature of time that emerges, it serves as a useful 
base in our comparison of treatments of time and modality. What is the view 
of the nature of time that has emerged from this tradition? The picture is one 
given by Minkowski in terms of a four dimensional space-time continuum. 
The so-called Minkowski diagram shows us something’s path through time 
in terms of a series of ordered pairs, representing the object’s being at 
position p at time t, and at p’ at t’. Suppressing subtleties, a Minkowski 
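diagram looks like this: 

[Figure: Minkowski diagram, with the time axis t, the position axis p, the wavy 
Space-Time Path of x, and the dotted light cones.] 
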
The t-axis gives us time, and the p-axis gives the three spatial dimensions. 
The wavy line presents the Space-Time Path of some object, x, and the 
dotted lines mark out the ‘light cones’. The light cones, by showing the path 
light would take, mark out absolute limitations in the possible Space-Time 
Paths available to x. There are a number of important features to note in 
regard to the Minkowski diagram. (a) No special significance is given to a 
time t being now, or in the present. Times (like places) are given by 
‘canonical names’, typically representing real numbers. (b) The model 
represents the reduction of a ‘dynamic’ process to a ‘static’ conception. 
Motion in time is captured by an atemporal diagram. One is not supposed to 
imagine x moving along the path in the Minkowski diagram. Rather, the path 
in the diagram is held to completely capture x’s ‘movement’ through time. 
For x to move in time just is for it to be (tenseless) at place p at t, and 
(tenseless) at place p’ (≠ p) at time t’ (≠ t). Given the standard Cartesian 
treatment of such co-ordinate systems, we can reproduce the entire 
Minkowski diagram in a purely mathematical way, as a set of ordered pairs, 
<p, t>, of real numbers. (Of course since p represents the three spatial 
dimensions, what we will really have is ordered quadruples.) The ‘flow’ of 
time is now fully captured in a ‘static’, mathematical, model. (c) We should 
not read the Minkowski diagram as showing that all events are simultaneous. 

1 Lawrence Sklar also has misgivings about this argument. See (Sklar [1981]). 

Clearly, according to the diagram, different events will occur at different 
times. What is implied, however, is that different times are equally ‘real’; the 
fact that time t occurs in the future has no effect on its ontological status. 
Indeed, there is complete parity in the ontological status accorded to each 
time, whether it be in the past, present, or future. (d) The diagram shows us 
how to represent constraints on the permissible Space-Time Paths. The 
light cones, for example, restrict the possible shapes of these paths. We can 
see causal laws, in general, as providing us with such constraints. That is 
why quantum indeterminism represents no threat to the view of time 
given by Minkowski. The absence of deterministic causal laws in certain 
circumstances would not mean¹ that the future, being ‘open’, is not yet 
‘really there’, and thus has a different ontological status from the present. 
Rather, such pockets of indeterminism would imply a relaxing of the 
restrictions on the curves possible on certain Space-Time Paths. 
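
The two relativistic facts relied on in this section, that simultaneity and temporal order are frame-dependent for events outside each other's light cones while the order of time-like separated events is absolute, can be checked numerically. The sketch below assumes one spatial dimension and units in which c = 1; the function names and the sample events are hypothetical.

```python
import math

def boost(event, v):
    """Lorentz-transform an event (t, x) to a frame moving at velocity v (c = 1)."""
    t, x = event
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (t - v * x), gamma * (x - v * t))

def interval_squared(e1, e2):
    """Invariant interval dt^2 - dx^2; positive means time-like separation."""
    dt, dx = e2[0] - e1[0], e2[1] - e1[1]
    return dt * dt - dx * dx

def earlier(e1, e2, v):
    """True if e1 precedes e2 in the frame moving at velocity v."""
    return boost(e1, v)[0] < boost(e2, v)[0]

origin = (0.0, 0.0)
timelike = (2.0, 1.0)     # inside the light cone of the origin
spacelike = (1.0, 2.0)    # outside the light cone of the origin
```

For the time-like pair, `earlier(origin, timelike, v)` holds for every admissible v, whereas for the space-like pair the order is reversed in a frame moving at v = 0.9.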

With this discussion of the Minkowskian geometry of Space-Time our 
sketch of the picture of time that dominates contemporary thinking is nearly 
complete. It remains to mention the fact that although in Special Relativity 
there are multiple inertial frames, no one of these is considered 
‘privileged’—i.e. no one inertial frame is singled out as giving the true 
simultaneity relations between events. Indeed, the principles themselves of 
Special Relativity are invariant across inertial frames; the Theory itself is 
equally true, and equally applicable, at each reference-frame. A useful 
comparison can be made, at this point, between Einstein’s treatment of 
inertial, or reference, frames, and Kripke’s treatment of ‘modal-frames’, 
otherwise known as possible worlds. The lead idea of Kripke’s semantics for 
modal logic is that “necessary p”, or “□p”, is true at a possible world, α, just 
in case “p” itself is true in all possible worlds accessible to α (Kripke [1963]). 
By considering different accessibility relations (in regard to reflexivity, 
symmetry, and transitivity), Kripke was able to outline different modal 
systems, in some of which the truth of modal propositions, like “□p”, 
varied from world to world. In the modal system S5, however, which is 
considered by most to reflect the pure, or purely metaphysical, conception of 
modality, the axioms force accessibility to be a full equivalence relation, 
with the consequence that the truth-value of modal propositions is invariant 
across ‘modal frames’. David Lewis (Lewis [1973]), in effect, exploits this 
similarity between the invariants in Special Relativity and in the Modal 
system S5, to produce a metaphysics of modality isomorphic to the one we 
have seen accorded to time. What is interesting is that Lewis’ opponents, 
while (presumably) accepting the Minkowski geometry and metaphysics of 
time, as well as Kripke’s semantic interpretation of the Modal System S5, 
have rejected Lewis’ symmetrical metaphysics of S5, and hence, modality. 
How is this possible? 
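
The contrast just drawn between modal systems can be illustrated with a small sketch. It is only an illustration under stated assumptions: the three world names, the proposition p, and the helper `box` are hypothetical, and the S5-style frame is modelled in its simplest form, as a single equivalence class in which every world is accessible from every world.

```python
from itertools import product

def box(p_worlds, worlds, access):
    """Return the set of worlds at which 'necessary p' is true.

    p_worlds: the worlds at which p is true
    access:   set of pairs (w, v), read as 'v is accessible from w'
    """
    return {
        w for w in worlds
        if all(v in p_worlds for v in worlds if (w, v) in access)
    }

# Hypothetical toy model: three worlds, p fails only at gamma.
worlds = {"alpha", "beta", "gamma"}
p_worlds = {"alpha", "beta"}

# S5-style frame (assumption: one equivalence class, so every world sees every world).
s5_access = set(product(worlds, repeat=2))

# A weaker frame: reflexive, but neither symmetric nor transitive.
weak_access = {(w, w) for w in worlds} | {("alpha", "beta"), ("beta", "gamma")}

box_s5 = box(p_worlds, worlds, s5_access)      # the same verdict at every world
box_weak = box(p_worlds, worlds, weak_access)  # the verdict varies with the world
```

In the S5-style frame “□p” receives the same truth-value at every world, whereas in the weaker frame it is true at one world and false at the others.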


1 Adolf Grünbaum argues this point persuasively, against Hermann Bondi. See (Grünbaum
[1967]).
3 THE SEMANTICS AND METAPHYSICS OF
“THE ACTUAL WORLD” 


Robert Stalnaker (Stalnaker [1976]) has distinguished Lewis’ semantical 
theory of “actual” from the metaphysical account that Lewis derives from it.
Lewis’ ‘indexical’ account of actuality is another instance of applying the 
Perry theory of indexicals we looked at earlier in regard to “now”. We can 
sketch such a semantic theory thus. In any particular occasion of use of an 
indexical, distinguish three elements: index, icon, and referent. The index 
represents the relevant aspect of the occasion of use, the context, that goes to 
determine reference. The index for “I” is the speaker, for “now”, the time 
of the utterance, for “actual”, the world of that utterance. The icon is the
actual token of the indexical, or the token speech act in which the term is
uttered. The referent, then, is the item referred to by that icon at that index.
The referent of the icon “the actual world”, then, at index α, is world α. This
account in terms of the structural relations between index, icon, and referent 
gives only the semantics of indexicals, however. Our metaphysical theory 
should tell us such things as: what the indices are, and which indices exist. 
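
The structural relations between index, icon, and referent can be set out in a minimal sketch. The names `Context`, `INDEX_OF`, and `referent`, and the sample values, are hypothetical and chosen only for illustration; the sketch covers the semantics alone and deliberately says nothing about which indices exist.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Context:
    """An occasion of use; each field is a candidate index."""
    speaker: str
    time: str
    world: str

# For each indexical, the aspect of the context that serves as its index.
INDEX_OF = {
    "I": lambda c: c.speaker,
    "now": lambda c: c.time,
    "the actual world": lambda c: c.world,
}

def referent(icon: str, context: Context) -> str:
    """Return what a token of `icon` refers to when uttered in `context`."""
    return INDEX_OF[icon](context)

# Two hypothetical occasions of use that differ only in their world.
at_alpha = Context(speaker="speaker_1", time="t1", world="alpha")
at_beta = Context(speaker="speaker_1", time="t1", world="beta")

ref_alpha = referent("the actual world", at_alpha)  # the world of that utterance
ref_beta = referent("the actual world", at_beta)
```

The rule treats “the actual world” exactly as it treats “I” and “now”: nothing in it singles out one world as privileged.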

Concerning the metaphysics of actuality Stalnaker argues that since, 
according to Lewis himself, worlds are “ways things could have been”, a 
possible world should not be seen as a concrete object, but rather as a 
property of all the objects—an abstract entity, a universal, that gives us a way 
things could be disposed. Stalnaker then exploits a traditional two part 
distinction in metaphysics: particular/universal and existence/instantiation. 
Particulars can exist or fail to exist, but universals (properties) even when 
they exist can still either be instantiated (obtain, be actual) or fail to be. For 
Stalnaker, all the possible worlds, as abstract properties, exist, but the 
question arises which worlds are instantiated (are actual). It is on this 
question that Stalnaker’s metaphysics of modality contrasts with Lewis’. 
According to Stalnaker, only one world (the actual world, or α, to give it a
name) really obtains. On Lewis’ ‘extreme realist’ position, however,
the notion of a world’s obtaining is not an ‘absolute’ one at all. For Lewis,
each possible world obtains, is actual, relative to itself, while no world is
‘really’, or ‘absolutely’, actual. The worlds are thus ontologically
indistinguishable—each exists, and each is actual relative to itself. This 
precisely parallels the Minkowski picture of time. All times equally exist, 
and each is ‘now’, or the present, relative to itself. The Stalnaker view, by 
contrast, represents an asymmetrical treatment of time and modality. 
Further, we should note that Stalnaker implicitly recognizes a distinction 
between intra-systemic and inter-systemic facts. An intra-systemic fact gives 
us how some world β represents things as being. According to β, one might
say, Reagan is a Democrat, the sky is red, etc. The inter-systemic facts
fall into two parts. (1) Facts about the general structure of modality. These 
are the necessary truths about the nature of possible worlds and the 
semantics of modal logic that Kripke and Stalnaker have urged on us. (2) 
The (contingent) facts about which world(s) obtain. The theory of modality
espoused by Stalnaker requires us to separate inter-systemic facts of type (2)
from the intra-systemic facts peculiar to each possible world, since it is
supposed to be an ‘objective’, ‘absolute’, fact about a world, β, that it does or
does not obtain.

What, then, are Stalnaker’s arguments against Lewis’ extreme realism 
about possible worlds? He offers basically two kinds of consideration. The 
first is that there is an absolute, or ‘neutral’, modal-frame—to wit, the actual 
world. This, for Stalnaker, is the privileged modal-frame. It is true, he 
argues, that if one were to utter “the actual world” at index, β, then β would
be referred to. But if β is not the actual world it does not obtain. The only
index that obtains is the actual world, and hence it has the special privilege of 
being the only context where the semantical account of actuality can actually 
be applied. What can one say about this argument? The most 
straightforward reply is that it is simply question-begging. We ask to be 
given a reason for believing that one modal index is privileged over the 
others, and we are simply told that one is privileged, viz. the actual world,
i.e. α. Using the semantics for “actual” that Stalnaker has endorsed, we find
the reference of “the actual world” here, at world α, to be α. So, we are told to
believe that α is privileged, that it alone obtains, but we are not told why we
should believe this. No one would accept this kind of reasoning if it were 
used to support temporal chauvinism, or “inertial chauvinism”. So why 
should we accept it here? 

The emptiness of Stalnaker’s first argument for modal chauvinism 
suggests that the weight of his conclusion really rests on his second 
argument. This proceeds as follows. Unlike solipsism, which represents a 
metaphysical position with genuine content, extreme modal realism is 
without content. The solipsist singles out one index for special treatment, 
while others fail to find just one index in a unique ontological position. With 
“the actual world”, however, according to Stalnaker, the situation is not 
analogous. He urges that “the actual world” has the same force as “Reality”,
or “all there is”, and that no one can reasonably deny that: all there is 
(or Reality) is all there is. 

What is going on here? Stalnaker is using ‘‘the actual world” in two 
different senses, and is accordingly conflating intra-systemic with
inter-systemic facts. By the semantical account of “actual” that Stalnaker accepts,
“the actual world”, given that we are at index α, refers (simply) to world α.
But although it is a tautology that: the actual world obtains, it is equally 
tautologous at every possible world. No special privilege accrues to our 
world, α, merely from the relativistic semantics of “actual”. Stalnaker,
however, wanted to use a neutral, objective, inter-systemic notion of
obtaining and to argue that in this sense the actual world alone obtains. But,
as we have seen, in this sense, it just isn’t tautologous that the actual world
(i.e. α) obtains. Stalnaker could choose to use “the actual world” in a new
sense, to mean: “the totality of inter-systemic facts, type (2), about all the
possible worlds”. But then although it would be trivially true that: the actual
world is all that obtains, it would no longer mean that: α is the only possible
world that obtains.¹

It is indeed far from clear just how one should analyze the difference 
between intra- and inter-systemic facts. That α obtains is an inter-systemic
fact about α. But it might not have obtained: α might not have been actual.
How shall we represent this fact? Shall we say that in some ‘super-world’ φ,
α does obtain while in ψ, it does not? This would only postpone our problem,
since although φ obtains, it too might not have. And how shall we represent
this? However we resolve the dilemma, it remains that Stalnaker, like other 
modal chauvinists, requires an inter-/intra-systemic distinction, and that 
his second argument against Lewis turns on collapsing this very contrast. 
We will have to look elsewhere if Lewis is to be refuted. 

Peter Van Inwagen has advanced different arguments against Lewis’ 
symmetrical treatment of time and modality. (Van Inwagen [1980]). 
According to Inwagen, it is inaccurate to treat “actual” as an indexical at all.
Within a world, say α, a genuine indexical, like “I” or “now”, will vary its
reference from context (index) to context. These are taken to be the 
paradigmatic indexicals. If “actual” is taken to be an indexical, this will 
represent an unmotivated, gratuitous, extension of the notion of 
indexicality, and render the thesis of the indexicality of actuality without 
interest. Inwagen’s idea is that if “the actual world” is taken to be an 
indexical because its reference shifts from world to world, then the same 
reasoning will apply to such terms as: “the President of the U.S. in 1984”.
But phrases of the latter sort are paradigms of non-indexical terms, so that in 
likening “the actual world” to them nothing interesting is being claimed 
about its status as an indexical. 

This argument of Inwagen’s admits of the following reply. The semantics 
of modal logic, as introduced by Kripke, has precisely widened the scope of 
the theory of indexicals. In addition to the traditional indices of person, 
time, and place, we are now asked to consider the index: possible world. 
Naturally, once we admit the new indices into our theory, we leave it open 
that we will widen the class of indexicals. Moreover, terms like “the
President of the U.S. in 1984” are not simply labelled indexicals; they are 


1 One might try to defend Stalnaker by insisting that: (a) possible worlds be considered total,
and (b) that they not be considered co-obtainable. It should be clear by now, however, that this 
artful dodge will not suffice. Concerning (b), we observe that it simply begs the question 
against Lewis. Moreover (b) is no more obvious than the corresponding principle for times: 
just because each time, t, is a possible time for events to occur doesn’t mean that events can’t 
‘really’ occur at different times (the Minkowski picture). Perhaps (b) relies on (a). But (a) also 
won’t work. If by “total” we understand (in Wittgenstein’s phrase): “everything that is the 
case”, then we have seen that possible worlds can’t be total. For it was to be a separate 
(contingent and ‘objective’) fact about a world that it obtains. Moreover, if we insist that the 
actual world, α, does contain this fact, then, unfortunately, so will (for the corresponding fact)
world, β. And that brings us back once again to Lewis.


held to be equivalent to: “the actual President of the U.S. in 1984”, and
these terms will clearly shift their referents from world to world, depending 
on who at that world is President. True, many more terms will thus turn out 
to be indexicals, but this is hardly surprising given the nature of the new 
indices (i.e. the possible worlds). What is necessary is to show, rather, that it
is impermissible in the first place to extend the class of indices, or contexts, 
to include possible worlds. Inwagen does indeed have an argument for this 
thesis. 

According to Inwagen, it is a presupposition of genuine indexicals, like 
“now”, that each utterance of this term occur at a unique time. We can then 
say that the reference of this utterance, or icon, is the time of its occurrence. 
But since an utterance of “the actual world” can occur in many different
worlds, there is no such thing as the world in which the utterance occurs. Let
us grant (contra Lewis) the thesis that a particular utterance of “the actual
world” can occur in different worlds. Does it follow that no clear sense can
be made of this term as an indexical? Why can’t one simply specify that: for
any world, w, if an utterance of “the actual world” occurs in w, then in w this
term refers to w? It is true that we can’t simply specify the referent of this 
icon in terms of the (world) context of its occurrence. But we are free to 
specify that its referent at a world, is that world itself. And this indeed is 
Lewis’ theory. These first two arguments of Inwagen’s, then, are not very 
persuasive. His third argument, however, is more interesting. 


4 REFERRING TO TIMES AND WORLDS 


Inwagen (as well as Forbes and Davies; see Forbes [1983] and Davies [1983])
raises the question of how we can refer to the actual world. Since it is 
epistemically indistinguishable from an infinity of similar possible worlds, 
how do we succeed in having this one unique world in mind to refer to? I 
believe that this is indeed a problem for the theory of modality, and in 
general for the theory of reference, and I have argued elsewhere that it 
threatens the validity of Perry’s analysis of indexicals (Yourgrau [1982]). 
But our question here is simply: is there reason to believe that the Perry- 
style analysis of indexicals works for “I” and “now”, but fails for “actual”? I
cannot see that there is. Perry does not impose any epistemic constraints on 
how the reference of “now” is determined: we simply note that an utterance 
of “now”, at t, refers (or refers then) to t. Similarly, the analysis tells us that 
an utterance of “the actual world” at world β, refers (in that world) to β. The
cases are entirely symmetrical. 

What if one goes beyond Perry and does impose some sort of epistemic 
requirement on reference in general, and in particular in regard to 
indexicals? Is one justified in imposing this requirement unequally between 
indexicals like “I” and “now”, and those like “actual”? This is just what 
Inwagen (as well as Forbes and Davies) does. It is put forward that with 
terms like “I”, “here”, and “now”, one must have some sort of ‘rapport’,
epistemically speaking, with one’s referent. It is then claimed that no such
epistemic connection is possible with respect to the actual world (we cannot
single it out from other possible worlds by pointing, for example), and that
therefore (!) no such requirement is appropriate for “actual”. This
asymmetrical attitude toward the requirements of indexical reference is, 
however, entirely without support. If there are epistemic constraints on 
indexical reference, then there is no more reason to believe they hold for 
times than for worlds. If the constraints are too stringent for the typical 
utterer of “actual” to satisfy, then we are forced to conclude that either (a)
we really cannot use the term “actual” to refer to α after all, or (b) our general
account of the epistemology of reference is faulty. Although there is a
legitimate question, therefore, concerning our ability to refer to possible
worlds, including the actual world (α), this fact does not argue in favour of an
asymmetrical treatment of tense and modality. 

One might try to argue that at least concerning one possible world, the 
actual one, one can easily satisfy any reasonable epistemic requirements on 
reference. One can refer to α because it can be picked out as: the unique
world which has the property of obtaining, of being actual. The problem,
however, is that while this may solve the question of reference to α, it hardly
can be used to support the thesis of modal chauvinism. For the question
remains whether there is a unique world which obtains, and this account of
our ability to refer to α simply presupposes a certain answer to this question.
Moreover, even as a theory of reference this account has difficulties. What
kind of property is obtaining? If it is to be the classical, ‘complete’, kind,
captured by a Fregean sense, then the truth-value of “α is actual” should be
context-insensitive. But as we saw earlier, α might not have been actual. In
other circumstances, β might have been actual. Thus, the actuality of α does
not represent the complete ‘Fregean’ property we were after. (This is of
course just the modal analogue of the difficulty of defending McTaggart’s
A-series by treating “is present” as representing a genuine ‘property’ of
times—a difficulty that led philosophers to accept the Perry-style theory in 
terms of roles.) The issue of reference to times and worlds just does not seem 
to provide fuel for Lewis’ opponents. Is there any argument that does count 
against Lewis, and in favour of modal chauvinism? We conclude by 
examining a final argument concerning our movement through time, but not 
through ‘modal space’. 


5 MOVEMENT IN TIME AND MODALITY 


In a recent article, Martin Davies has agreed with Graeme Forbes that: 
“The significant difference [between tense and modality] ... is that we move 
through time and from place to place, but we do not move through the other 
possible worlds in which we exist” (Davies [1983]: p. 131). Even if this 
view were correct, however, that we move through time but not modality, it is
far from clear what its relevance would be concerning Lewis’ indexical theory 
of actuality. Tyler Burge, for example, has offered an indexical account of 
truth (Burge [1979]) and proper-names (Burge [1973]), and he (Burge 
[1984]) and Tony Anderson ([1983]) and Fred Dretske (Dretske [1981]) 
have presented indexical treatments of knowledge, in none of which does the
concept of ‘movement’ play a role (though for some criticisms of this
approach to knowledge see Yourgrau [1983]). More importantly, however,
Forbes and Davies seem to ignore the way movement in time is
understood in the Minkowski geometry. As we recall from our earlier discussion,
in the Minkowski account something ‘moves’ through time when it is at 
place p at time t, and at p′ (≠ p) at t′ (≠ t). The analogue for ‘movement’
through ‘modal space’ would then be: x occupies Space-Time Path P in
world α, and Space-Time Path P′ (≠ P) in world α′ (≠ α). The analogy seems
complete. If one is tempted to observe that this ‘movement’ through Modal 
Space is not really movement at all, since x simply occupies different Space- 
Time Paths at different contexts, then he should remind himself that the 
same considerations would invalidate Minkowski’s treatment of ‘movement’ 
through time. In the one case a dynamic process is represented 
(paradoxically?) by a static model, in the other, modality is represented 
(paradoxically?) by an extensional conception. We could, indeed, give the 
modal analogue of a Minkowski diagram, in terms of a ‘Modal Path’ through 
a five-dimensional space: 


[Figure: ‘Modal-Minkowski-Diagram’, with axes labelled PW and STP]


Here, the STP axis represents x’s Space-Time Paths, and the PW axis gives 
us the possible worlds in which x exists. (We saw earlier that each particular 
total Space-Time Path could be represented as a set of ordered quadruples 
of real numbers [and each such set in turn could be reduced to a single 
‘Gödel-number’].) This ‘Modal-Minkowski-Diagram’ offers a nice picture
of things, but one asks: what is the domain of the Possible Worlds Axis? Is
the set of all possible worlds really ordered? Note, first, that the question is 
not one of the correct ordering of this set, but of an ordering sufficient for 
our purposes. [Even the continuum, for example, only gives us an ordering 
of the real numbers.] Given that this set is (at least!) of the cardinality of 2^ℵ₀,
the assumption of ordering will involve commitment to principles as strong 
as the Axiom of Choice. But are there such orderings? Lewis assumes there 
are various interest-relative orderings of the worlds in regard to 
‘comparative similarity’. He also observes that one can easily construct at 
least one such ordering by adopting an expedient of Quine’s. The idea is to 
treat the worlds as 


“... equivalence classes, under certain transformations of co-ordinates, of mappings
from the set of all quadruples of real numbers to 0 (for the quadruples which are the
co-ordinates of points unoccupied by matter) and 1 (for the co-ordinates of occupied 
points). We might define the distance between any two of these mappings as the 
hypervolume of the set of quadruples on which the two mappings differ in value; and 
we might next define the distance between any two sets of these mappings . . . as the 
greatest lower bound on the distances between a mapping in one set and a mapping in 
the other. It remains only to equate the similarity ‘distance’ between two worlds with 
the defined distance between the two corresponding ersatz worlds.” (Lewis [1973],
p. 94).


Such artificial expedients as these may give us pause, but we should not be 
over-impressed with the air of artifice. Possible worlds are just too dizzyingly
complex for us (now—or perhaps ever) to have any intuitive grasp of more 
than a minimum of the structure of their relations to each other. If even 
Cantor’s sequence of ascending transfinite cardinals beggars the 
imagination, how much more so must the scheme of whole worlds. All that is 
needed, however, for the notion of a modal-Minkowski-Diagram to have 
any purchase is that there be an ordering for the PW axis, and although the 
matter is still open, no strong arguments have been shown that no such 
orderings exist. (Indeed, even the arch modal chauvinist, Stalnaker, has
committed himself to a partial ordering (Stalnaker [1968]).)
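
The Quine-style construction quoted above from Lewis can be made concrete in a finite toy version. This is a sketch under loud simplifying assumptions: space-time is replaced by a small grid of integer quadruples, each cell has the same hypervolume, a mapping is represented by its set of occupied cells, and an ersatz world is a finite set of such mappings, so that the greatest lower bound is simply a minimum. The function names and sample data are hypothetical.

```python
from itertools import product

def mapping_distance(a, b, cell_volume=1.0):
    """Hypervolume of the region on which two occupancy mappings differ.

    a, b: frozensets of occupied cells, each cell a 4-tuple (x, y, z, t).
    Assumption: every cell has the same hypervolume `cell_volume`.
    """
    return cell_volume * len(a ^ b)  # symmetric difference = cells where values differ

def world_distance(class_a, class_b, cell_volume=1.0):
    """Greatest lower bound of the distances between members of two classes.

    For the finite classes used here the greatest lower bound is the minimum.
    """
    return min(
        mapping_distance(a, b, cell_volume)
        for a, b in product(class_a, class_b)
    )

# Two co-ordinate descriptions of one toy world (the second is a shifted copy).
m1 = frozenset({(0, 0, 0, 0), (1, 0, 0, 0)})
m1_shifted = frozenset({(1, 0, 0, 0), (2, 0, 0, 0)})
# A second toy world with one extra occupied cell.
m2 = frozenset({(0, 0, 0, 0), (1, 0, 0, 0), (5, 5, 5, 5)})

world_a = {m1, m1_shifted}
world_b = {m2}

d_ab = world_distance(world_a, world_b)  # similarity 'distance' between the worlds
d_aa = world_distance(world_a, world_a)  # a world is at distance zero from itself
```

A distance of this kind is all the PW axis needs: it induces an ordering of the worlds relative to any chosen world, however artificial that ordering may be.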

Are there any constraints on permissible Modal Paths, any analogue to the 
light cones of the Minkowski diagrams? We leave this question open. It will 
turn on the much-vexed issue of Trans-world Identity, and the question of 
which properties individuals possess essentially. What is clear, however, is 
that although one cannot literally have any causal effect on other possible 
worlds, the way causality is treated in the Minkowski diagrams (in terms of 
constraints on possible Space-Time Paths) has a coherent analogue in the 
notion of constraints on possible Modal Paths. (For example, it may turn out 
that there is no Modal Path which includes a Space-Time Path for Reagan,
in some world, β, that resembles that of a photon, in our world, α.) Once
again, we have failed to find a significant disanalogy between tense and
modality, and hence no argument in favour of modal chauvinism that would
not count equally for temporal chauvinism.¹


6 CONCLUSION: IS LEWIS CORRECT? 


We have seen that Lewis’ account, though counter-intuitive, is coherent, 
while the modal chauvinist has trouble demonstrating the plausibility, and 
even the coherence, of his theory. Either, therefore, Lewis is correct, or else 
the whole modal framework of Kripke and others, in which we are permitted 
to quantify over worlds, is misconceived. Perhaps we can gain some insight
into the latter alternative if we observe that the heart of the dilemma faced by 
the modal chauvinist, the inter/intra-systemic distinction and the absence of 
a privileged reference frame, are characteristic dilemmas of type-theoretical 
phenomena. These are just the kinds of difficulties that prompted, and still 
beset, Russell’s Theory of Types and Tarski’s Hierarchy of Formal Truth- 
Predicates, insofar as these offer to present the whole truth about “truth”. A 
solution along type theoretical lines would, however, threaten the very 
notion of a possible world, insofar as worlds purport to be ‘total’. An even 
more radical approach would be to mimic that of Tyler Burge’s ‘indexical’ 
theory of truth (Burge [1979]). There are two crucial features to Burge’s 
theory: (1) the ordinary, natural-language, predicate “true” is treated as an
indexical (“trueᵢ”), having different extensions in different contexts (so that
the structure of its use is isomorphic to Tarski’s formal hierarchy of truth- 
predicates), and (2) the semantics of the theory does not permit us to 
quantify over the possible indices of the truth predicate. Every sentence that 
is true, is trueᵢ, for some i, but we are not ‘formally’ permitted to quantify
over the position occupied by i. The general semantics is provided by 
treating the laws concerning truth as being what Burge calls
‘schematic-generalisations’: “P is trueᵢ”. We are supposed to “see” that the case is
general (we are free to substitute any particular value for i), but we must
refrain from quantifying over indices, i (lest the liar-style paradoxes reappear).
Now, whatever the prospects for this theory as an account of truth (and
I have serious misgivings here; see T. Anderson’s reservations at the
conclusion of Anderson [1983]), it does offer us another model for
attempting a solution to the ‘type-theoretical’ problems faced by the 
possible worlds framework. 


1 I have not dealt with the question of the unintuitive consequences of Lewis’ theory because 
these seem to me not to go to the heart of the matter. Nevertheless a word may be said about 
whether the modal chauvinist really fares any better than Lewis on the matter of paradoxical 
consequences. On Lewis’ view, no matter what actions I perform here, at α, I can have no
effect, for example, on the amount of suffering, or ignorance, that exists, for on his account
all the actions of my ‘counterparts’ have consequences as ‘real’ as mine. This is indeed a
disturbing development of the theory. But note that for the modal chauvinist there is an
equally unintuitive result, that we at α cannot ourselves choose to actualize our world (since
all action is ‘within’ a world, and thus cannot effect the intersystemic result of making the 
world itself actual), and hence cannot ‘really’ have any effect on how things will turn out. 
It may be, then, that the doctrine of modal chauvinism will force us into a
deeper analysis of the basic structure of modality. The original picture of 
‘total worlds’ may have to be abandoned. It remains to be seen whether the 
thesis of chauvinism will survive such a transformation. What has emerged, 
however, is that it cannot thrive along the lines of the traditional account. 


Barnard College, Columbia University 
ABSTRACT


The rule to maximize expected utility is intended for decisions where options involve 
risk. In those decisions the decision maker’s attitude toward risk is important, and 
the rule ought to take it into account. Allais’s and Ellsberg’s paradoxes, however,
suggest that the rule ignores attitudes toward risk. This suggestion is supported by 
recent psychological studies of decisions. These studies present a great variety of 
cases where apparently rational people violate the rule because of aversion or 
attraction to risk. Here I attempt to resolve the issue concerning expected utility and 
risk. 

I distinguish two versions of the rule to maximize expected utility. One adopts a 
broad interpretation of the consequences of an option and has great intuitive appeal. 
The other adopts a narrow interpretation of the consequences of an option and seems 
to have certain technical and practical advantages. I contend that the version of the 
rule that interprets consequences narrowly does indeed neglect attitudes toward risk. 
That version of the rule excludes the risk involved in an option from the 
consequences of the option and, contrary to what is usually claimed, cannot make up 
for this exclusion through adjustments in probability and utility assignments. I 
construct a new, general argument that establishes this in a rigorous way. On the 
other hand, I contend that the version of the rule that interprets consequences 
broadly takes account of attitudes toward risk by counting the risk involved in an 
option among the consequences of the option. I rebut some objections to this version 
of the rule, in particular the objection that the rule lacks practical interest. Drawing
upon the literature on ‘mean-risk’ decision rules, I show that this version of the rule 
can be used to solve some realistic decision problems. 


The rule to maximize expected utility (MEU) purports to yield rational 
decisions when options involve risk. It has been criticized, however, for 


* I am indebted to the referee for several helpful suggestions. Also, I benefited from discussions 
of earlier versions of this paper at the 1984 Eastern Division Meeting of the American 
Philosophical Association and at the 1985 Ohio State University Working Conference on 
Decision Theory. I would like to acknowledge especially the comments provided by Douglas 
MacLean at the APA meeting and by Ed Green at the OSU conference. 
neglecting to take account of a decision maker’s attitude to risk. Recent
psychological studies, for example, ones by Daniel Kahneman and Amos 
Tversky [1979], have intensified the issue. These studies argue, among 
other things, that certain systematic and widespread violations of MEU are 
caused by aversion or attraction to risk. It seems one must maintain that 
these violations of the rule are due to egregious irrationality, or else concede 
that the rule fails to take account of a decision maker’s attitude to risk.¹

In this paper I try to resolve the issue about MEU and risk. I do this by 
distinguishing two common versions of MEU. One I yield to the critics. 
The other I defend. Most of the considerations that arise are not new. But I 
attempt to present them with greater clarity and precision. For example, I 
distinguish aversion to risk in the nontechnical sense involved in the 
objections to MEU from aversion to risk in the technical sense involved in 
certain defenses of MEU. As a result of such efforts, I hope that the 
arguments against one version of MEU and for the other are more 
compelling. 

Let us begin by reviewing the history of the debate about expected utility 
and risk. In the middle of the century John von Neumann and Oskar 
Morgenstern ([1944] and [1947]) revolutionized utility theory by applying 
measurement theory to utility. They introduced various axioms of 
preference and, in accordance with the procedures of measurement theory, 
showed that given those axioms, utility can be precisely defined. Many 
thought that the axioms of preference they introduced justified MEU. But 
others, such as Maurice Allais ([1953] and [1979]), criticized those axioms 
and argued that MEU neglects aversion to the risk involved in an option 
when the possible consequences of the option have widely divergent 
utilities. Decision theorists entertained handling Allais’s criticism by 
interpreting consequences broadly so that the risk involved in an option is 
included among its consequences along with monetary gains and losses etc. 
But most objected that this tactic would be ad hoc, would double count risk, 
would block the application of measurement theory to utility or would make 
assessing the utilities of consequences so difficult that the rule to maximize 
expected utility would be useless.²

While the controversy simmered, new suspicions that MEU mishandles 
risk arose. Notably, Daniel Ellsberg [1961] raised suspicions that the rule
neglects aversion to the risk involved in an option when the probabilities of
the states of the world determining its consequences are based on slender
evidence.³ On the other hand, Leonard Savage [1954] and others extended
Von Neumann and Morgenstern’s approach to both probability and utility. 


1 Actually there is another possibility. MEU assumes certain ideal conditions, such as 
knowledge of all logical truths (see note 6). Real choices are not made under these ideal 
conditions, even when they are made under ‘laboratory’ conditions. Thus violations of MEU 
may be attributed to nonideal conditions rather than irrationality. 

2 Harry Markowitz ([1959], Chp. X) gives a good account of some of these arguments.

3 Ellsberg speaks of aversion to uncertainty, which he distinguishes from aversion to risk. 
However, aversion to risk in our sense includes what he regards as aversion to uncertainty. 
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In the process they introduced new axioms of preference that seemed to 
many to justify MEU even more solidly than Von Neumann and 
Morgenstern’s axioms.1

Our attempt to resolve the dispute focuses on the interpretation of 
consequences in MEU. Various classical and contemporary formulations of
MEU, from Daniel Bernoulli’s [1738] to Richard Jeffrey’s [1965], count all 
the results of an option as consequences of the option. In contrast, as we will 
see, Von Neumann and Morgenstern’s and Savage’s formulations of MEU 
exclude some of the results of an option from the consequences of the option. 
I call MEU with consequences taken broadly MB, and MEU with 
consequences taken narrowly MN. I argue that MN does indeed neglect 
attitudes toward risk, whereas MB does not. 

My case against MN appeals to the paradoxes of Allais and Ellsberg, but 
my principal argument against it uses general considerations rather than 
intuitions about particular cases. My arguments for MB show that taking 
consequences broadly so that risk is included is sound. They show that the 
inclusion of risk is not ad hoc and does not double count risk. Also they show 
that the inclusion of risk is methodologically satisfactory. They show that it 
does not interfere with the application of measurement theory to probability 
and utility, or stand in the way of practical problem solving. 

The arguments against MN, which are presented before the main 
arguments for MB, assume that MB is correct (or, more precisely, that MB 
is correct if MN is correct). Using this assumption before it is argued is 
convenient for expository purposes. And it does not prejudice the case 
against MN. Most proponents of MN prefer MN to MB, not because they 
think that MB is false, but rather because they think that MN has certain 
methodological advantages.2


1 TWO VERSIONS OF MEU: MN AND MB


MEU as we take it here is a normative rule. It does two things: (a) it defines 
the expected utility of an option and (b) it says to form preferences among 


1 For a general review of the literature on expected utility and risk, see Paul Schoemaker [1982],
Bernt Stigum and Fred Wenstøp [1983], and Ole Hagen and Fred Wenstøp [1984]. For
surveys of the relevant psychological literature, see Paul Slovic et al. [1977], Robin Hogarth
([1980], Chp. 4), Paul Schoemaker [1980], Hillel Einhorn and Robin Hogarth [1981], John Hershey et
al. [1982], and John Payne [1982]. And for a review of applications of measurement theory to 
probability and utility, see Peter Fishburn [1981]. 

2 For example, Harry Markowitz ([1959], p. 225), Howard Raiffa ([1968], p. 85 f.), Amos
Tversky ([1975], p. 170 ff.), and Mark Machina ([1981], p. 173) object that interpreting
consequences broadly would make MEU unattractive because of methodological 
considerations. They do not object that interpreting consequences broadly would make MEU 
false. 

3 Those who subscribe to an operationalistic theory of meaning (in contrast to an
operationalistic methodology) would object that classical formulations of MB are 
meaningless. But even they would not object that all formulations of MB are meaningless. 
Jeffrey’s formulation, for instance, meets whatever operationalistic standards MN meets.

options, and ultimately to decide, according to expected utilities.1 We will
focus on the definition of expected utility. It says that the expected utility of 
an option is a probability weighted average of the utilities of the possible 
consequences of the option. More precisely, let o be an option, and let \(s_1, s_2, \ldots, s_n\)
be mutually exclusive and exhaustive possible states of the world that
are causally independent of o and determine the consequences of o. And for
each state \(s_i\) let \(P(s_i)\) be the probability of \(s_i\) and let \(U(C[o, s_i])\) be the utility
of the consequences of o given \(s_i\). Then according to MEU, the expected
utility of o is \(\sum_i P(s_i)\,U(C[o, s_i])\).
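
The definition just stated can be illustrated with a minimal Python sketch. The function name `expected_utility` and the sample numbers are assumptions introduced here for illustration only; they are not part of the original text.

```python
from math import isclose

def expected_utility(probabilities, utilities):
    """Return the sum over states of P(s_i) * U(C[o, s_i]) for one option o.

    probabilities: P(s_i) for mutually exclusive, exhaustive states.
    utilities:     U(C[o, s_i]) for the same states, in the same order.
    """
    if len(probabilities) != len(utilities):
        raise ValueError("need exactly one utility per state")
    if not isclose(sum(probabilities), 1.0):
        raise ValueError("state probabilities must sum to 1")
    return sum(p * u for p, u in zip(probabilities, utilities))

# Hypothetical option with two states (the numbers are made up).
example_eu = expected_utility([0.8, 0.2], [4.0, 0.0])
```

The sketch leaves open how the probabilities and utilities are obtained and what counts as a consequence, which is exactly where MN and MB differ.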

(Some authors generalize this standard definition of expected utility to 
accommodate possible states that are not causally independent of options.
But since such generalizations are controversial, and are not important in the 
dispute about expected utility and risk, I pass over them.) 

MN is obtained by adopting certain interpretations of the key terms in the 
foregoing definition of expected utility. Specifically, MN presumes an 
account of probability and utility derived from certain axioms of preference. 
According to this account, probability and utility assignments exist for a 
person only when his preferences satisfy the axioms. And given satisfaction 
of the axioms, the assignments (on appropriate scales) are the probability 
and utility functions (on the scales) according to which his preferences 
maximize expected utility. As a result, if the axioms are satisfied, pref- 
erences necessarily maximize expected utility. Therefore if rationality re- 
quires satisfying the axioms of preference, it also requires satisfying MEU. 

The technical interpretations of probability and utility used in MN protect
the rule from many putative counter-examples involving attitudes toward 
risk. In virtue of these technical interpretations, a genuine counter-example 
has to present rational preferences that violate the axioms of preference, or 
equivalently, are such that there are no assignments of probabilities and 
utilities according to which the preferences maximize expected utility. A 
genuine counter-example cannot just provide some plausible probability 
and utility assignments and show that because of attitudes toward risk it is 
not irrational to form preferences, or make choices, contrary to the expected 
utilities obtained from those assignments. For those assignments might be 
adjusted in the light of attitudes toward risk until some assignments of 
probabilities and utilities are found that yield expected utilities agreeing 
with the preferences or choices.2 And the existence of any such assignments
vindicates MN. 


1 Our discussion of MEU assumes the usual idealization for the rule. That is, it assumes that 
the decision maker knows all logical truths, that he is certain about his beliefs and desires, and, 
in general, that conditions are perfect for maximizing expected utility. 

2 Markowitz [1952] shows how some apparent counter-examples to MN can be handled by 
manipulating utility assignments to accommodate risk. The utility assignments obtained save 
MN even if intuitively the assignments merge attitudes toward risk with the likelihood and 
desirability of consequences. See Clyde Coombs ([1975], p. 65 f.) and Paul Schoemaker 
([1980], Sec. 2.2, and [1982], p. 533 ff.) for some further discussion of such manipulated
utility assignments.

The technical interpretations of probability and utility do not make MN
unassailable, however. Suppose that a person has preferences that violate 
the axioms of preference. Then according to the technical interpretations, 
he lacks probability and utility assignments, and options do not have 
expected utilities for him. Consequently, his preferences among options do 
not maximize expected utility, and he violates the rule. Thus one can refute 
MN by producing a case in which rational preferences violate the axioms of 
preference, or equivalently, by producing a case in which rational 
preferences are such that they do not maximize expected utility according to 
any probability and utility assignments.1

MN also presumes a technical interpretation of the consequences of an 
option. Let us introduce it by contrasting it with two more familiar 
interpretations. Sometimes when people speak of the consequences of an 
option, they mean what would be caused by the option if the option were 
realized. They do not include the option itself. We will call the consequences 
of o given s in this sense the effects of o given s. Other times when people
speak of the consequences of an option, they mean everything that would be 
attributable to the option if the option were realized. They do include the 
option itself. We will call the consequences of o given s in this sense the 
results of o given s. In order to be inclusive and avoid neglecting factors that 
influence a rational evaluation of an option, MN takes consequences as 
results rather than effects. However, it must qualify this way of taking 
consequences in order to accommodate its interpretations of probability and 
utility. 

The axioms of preference used to define probability and utility for MN 
impose certain restrictions on consequences. For example, Von Neumann 
and Morgenstern’s axioms of preference, under their intended 
interpretation, assume that any probability mixture of consequences is an 
option ([1944], p. 26 f.).2 It follows that a possible consequence can be


1 The most questionable axioms are the so-called ‘independence’ or ‘sure-thing’ axioms. These 
axioms require, for instance, indifference between a sure-thing and a lottery with the same 
expected utility according to MN. For some doubts concerning these axioms, see Clyde
Coombs ([1975], p. 66 ff.), Paul Schoemaker ([1982], p. 541 ff.), Edward McClennen [1983], 
and Lanning Sowden [1984]. The first two works focus on the descriptive rather than the 
normative shortcomings of the axioms, but are suggestive from a normative point of view. 
The last two works concentrate on the normative shortcomings of the axioms.

The so-called ‘structural’ axioms are also highly questionable. They require a rich,
‘Archimedean’ body of preferences that actual decision makers may lack. However, most 
decision theorists take these axioms as parts of the idealization for MEU (see note 6) rather 
than as requirements of rationality. Objections to these axioms must therefore show that they 
are inappropriate idealizations. And this makes objections to them less straightforward than
objections to other axioms. 

In rejecting MN I reject the axioms that entail it. However, it should be noted that many of 
these axioms have plausible analogues when consequences are taken broadly so that risk is 
included. In particular, the sure-thing and independence axioms seem unobjectionable when 
consequences are taken broadly. In fact, MB, which I argue for later, entails the versions of 
the axioms obtained by taking consequences broadly. 

2 Options here are objects of preference. They are not necessarily available to the decision 
maker.

produced by a variety of options and that it can be produced with arbitrary
probabilities. Savage’s axioms of preference assume something similar that 
yields the same conclusion concerning consequences. They assume that 
there is a set of possible states of the world and a set of possible 
consequences, and that any function from possible states to possible 
consequences is an option ([1954], end papers and p. 14 f.).1 It follows that a
possible consequence can be produced by a variety of options—by options 
that yield the consequence in every state and by options that yield it only in a
single state. More specifically, it follows that a possible consequence can be 
produced with arbitrary probabilities by other options. We say that 
consequences meeting this condition are general rather than particular. And 
we take it as the distinctive feature of MN that it uses concepts of probability 
and utility introduced by axioms of preference that require that the 
consequences of an option be general in this manner. MN, therefore, 
in virtue of its interpretation of probability and utility, must take the 
consequences of an option to be just the results of the option that are general
in the same way. 
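
The structural assumption attributed to Savage above, that any function from possible states to possible consequences is an option, can be made concrete with a small Python sketch. The state and consequence labels are hypothetical, and the two-state setting is an assumption chosen only to keep the enumeration short.

```python
from itertools import product

# Hypothetical states and consequences (labels are illustrative only).
states = ["s1", "s2"]
consequences = ["win $1,000", "nothing"]

# Every function from states to consequences counts as an option (act).
acts = [dict(zip(states, outcome))
        for outcome in product(consequences, repeat=len(states))]

def acts_yielding(consequence):
    """Return the acts that produce the given consequence in at least one state."""
    return [act for act in acts if consequence in act.values()]

# The constant act yields the same consequence in every state.
constant_win = {state: "win $1,000" for state in states}
```

In this toy setting each consequence is produced by a constant act as well as by acts that yield it in a single state, which is the sense in which the consequences are general rather than particular.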

Given the above qualification, the consequences of an option according to 
MN must exclude the risk involved in an option. For the risk involved in an 
option depends upon the probabilities of the other results. And hence the 
risk and the other results cannot be produced with arbitrary probabilities by 
other options. Take, for example, the option of betting $1,000 on heads on a
coin toss. And consider the results, including the risk run, if one bets and 
wins. These results cannot be produced with certainty by some other option. 
Another option might produce a $1,000 gain with certainty. But the results 
of such an option would not include the risk involved in the bet. 

Since the risk involved in realizing an option is a result of the option that is 
not general, I will sometimes say it is particular to the option. Not every risk 
involved in an option is particular to the option, however. An option may, 
for example, have as a possible result a lottery that can be produced with 
arbitrary probabilities by other options. Such a result is general, and the risk 
it involves is not particular to an option that might produce it. The risk 
involved in the result influences the risk involved in an option that might 
produce it. But its influence varies from option to option depending upon 
the other results that might be produced. The risk involved in the result is a 
risk involved in an option that might produce it. But it is not what we call the 
risk involved in realizing the option, or the risk particular to the option. 

In summary, MN involves a restriction on the results of an option that 
count as consequences. This is why we say its interpretation of 
consequences is narrow. Furthermore, the restriction disqualifies the risk 
involved in an option as a consequence of the option. And this constitutes a 
prima facie case that MN neglects attitudes toward risk in the evaluation of
options.


1 Options here are objects of preference, as in Von Neumann and Morgenstern’s axioms.

Although proponents of MN think that attitudes toward risk should
influence preferences and decisions, they think that the rule does not err by 
failing to count the risk involved in an option as a consequence of the option. 
They think that attitudes toward risk influence preferences and decisions 
among risky options by influencing the assignments of probabilities and 
utilities of other results. For example, according to John Pratt [1964], 
Kenneth Arrow ([1970], Chp. 3), and Ralph Keeney and Howard Raiffa 
([1976], Chp. 4), in cases where the relevant other results are gains or losses 
of some quantitative commodity such as money or leisure time, aversion and 
attraction to risk influence the shape of the utility curve for the commodity.1
These authors claim that aversion to risk makes a person’s utility curve 
concave down (have a decreasing slope), and attraction to risk makes it 
convex down (have an increasing slope). Later I will argue that MN does 
neglect attitudes toward risk, contrary to what these proponents of MN 
claim. 
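
The claim reported above, that the shape of the utility curve for a commodity encodes an attitude toward risk, can be illustrated with a short Python sketch. The square-root and squaring utility functions and the dollar amounts are assumptions picked for illustration; the cited authors do not commit to them.

```python
from math import sqrt

def expected_utility(lottery, utility):
    """Probability-weighted average of utilities over (probability, amount) pairs."""
    return sum(p * utility(amount) for p, amount in lottery)

# Hypothetical options with the same expected monetary value of $50.
gamble = [(0.5, 0.0), (0.5, 100.0)]   # fair coin flip between $0 and $100
sure_thing = [(1.0, 50.0)]            # $50 for certain

def concave(amount):
    """Utility curve with a decreasing slope."""
    return sqrt(amount)

def convex(amount):
    """Utility curve with an increasing slope."""
    return amount ** 2

concave_prefers_sure = (expected_utility(sure_thing, concave)
                        > expected_utility(gamble, concave))
convex_prefers_gamble = (expected_utility(gamble, convex)
                         > expected_utility(sure_thing, convex))
```

With the concave curve the sure-thing has the higher expected utility, and with the convex curve the gamble does. The sketch only reproduces the view under discussion; it does not settle whether that view captures attitudes toward risk in the sense defended in this paper.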

Another line of defense for MN is to argue that rational attitudes toward 
risk are not basic but rather derivative from attitudes toward general 
consequences of options. If the argument is sound, then the utilities of 
consequences do not change when the risks involved in options are excluded
from consequences: the risks are superfluous. However, proponents of MN 
cannot take this line of defense. Putting aside doubts about its accuracy, 
it rests on a claim that a basic aversion or attraction to risk is irrational, 
and proponents of MN do not want to rule out the rationality of any such 
basic aversion or attraction. They want to argue for MN on purely 
instrumentalistic grounds. In the arguments concerning MN, therefore, we 
will assume that attitudes toward risk are not derivative from attitudes 
toward general consequences of options. 

Now let us turn to MB. It is obtained by adopting different interpre- 
tations of the key terms in the definition of expected utility given above. 
Specifically, MB takes probability as rational degree of belief, utility as 
rational degree of desire, and consequences as all results. Here the main 
difference is that consequences encompass all results, including risk. This is 
why we say that MB adopts a broad interpretation of consequences. 

Formulations of MB more precise than the one just given can be obtained 
by adopting concepts of probability and utility more precise than rational 
degree of belief and desire. MB does not require any particular way of 
making probability and utility more precise. It excludes only methods that 
use axioms of preference requiring a narrow interpretation of consequences. 

To illustrate the range of possibilities, probability and utility for MB can 
be defined classically as in Bernoulli [1738] or psychophysically as in Allais 
[1953]. Moreover, they can be defined using the techniques of measurement 
theory. For instance, Richard Jeffrey [1965] and Ethan Bolker [1967] take 


1 Hugh Chandler ([1975], Sec. 1) advances a similar view although he does not propound what
I call MN.

the consequences of an option given a state as the conjunction of the option
and state. Thus no results of the option given the state are excluded. Then 
they present axioms of preference that, together with an unboundedness 
condition for preferences, entail that there are unique probability and utility 
functions (on appropriate scales) according to which preferences satisfy a 
version of MB generalized to cover cases where the probabilities of states are 
not independent of the option adopted.1 Given satisfaction of the axioms
and condition, probability and utility assignments can be precisely defined 
as those functions. And then the axioms and condition entail the 
generalization of MB. If the axioms and condition are acceptable, they 
justify the generalization of MB. Therefore Jeffrey’s and Bolker’s work 
shows that adopting MB does not prevent applications of measurement 
theory to probability and utility. Their work shows that MB is not at a 
methodological disadvantage in this respect. It shows that the distinction 
between MN and MB is not a matter of congeniality with measurement 
theory but of the breadth of consequences.2


2 RISK 


The concept of risk involved in our arguments that MN neglects attitudes 
toward risk is distinct from the concepts of risk involved in certain defenses 
of MN. Hence it is important to introduce our concept of risk carefully. As 
will be apparent, our concept is nontechnical, i.e., not designed for a 
mathematical theory of risk. However, it is sufficiently clear for our 
purposes here. In particular, it supports the arguments against MN, and it 
supports the principles of risk that will be used later in applications of MB. 

Risk in its ordinary sense is danger or the possibility (but not necessity) of 
harm. However in our arguments against MN, risk has an extended and 
refined special sense. The sense of risk is extended (1) so that risk includes 
the possibility of any loss, not merely harm, and (2) so that risk includes the 
possibility of gain or loss, not merely loss. Accordingly, any option whose 


1 The version of MEU that they use, their desirability axiom (Jeffrey ([1965], p. 70)), takes 
consequences broadly as MB does, but also introduces conditional probabilities in order to 
remove the requirement that states of the world be independent of options. MB follows from 
it by induction, assuming some restrictions that are required to make UB(o & s), the utility for 
an option-state pair in their version of MEU, equivalent to UB(CB[o, s]). For these 
restrictions, see Weirich ([1980], p. 713 f.) and note 26.

For remarks on the special unboundedness condition, see Jeffrey ([1965], pp. 127 and 143). 
And for a more recent presentation of Jeffrey’s and Bolker’s axioms, see Jeffrey ([1983a],
Chps. 5-9). 

2 This reply does not assume that Jeffrey’s and Bolker’s axioms are acceptable. It only assumes
that the problems with their axioms, if any, are not due to taking consequences broadly. 

Also, to avoid misunderstanding, let me note that Jeffrey and Bolker do not advance their 
version of MEU to handle the problems with attitudes to risk. Jeffrey [1983b], for example,
says that the preferences advanced in Allais’s and Ellsberg’s paradoxes are irrational. He is
certainly not interested in making MEU accommodate those preferences. 


significant consequences are undetermined involves risk, even if the
possible consequences are all possible gains so that there is no risk of loss. 
Given this, one’s attitude to risk is one’s attitude to taking chances. Also, the 
sense of risk is refined by stipulating that risk is the epistemic, not the 
physical, possibility of gain or loss. This stipulation is appropriate in 
decision theory since decision makers often do not know what is physically 
possible. It has the same motivation as the stipulation that probability in 
MEU is epistemic rather than physical. 

Options involve risks of various sizes. The size of the risk involved in an 
option depends on factors such as the size of the stakes. As we understand a 
decision maker’s attitude toward risk, it includes his attitudes toward risks 
of various sizes. An aversion to risk, for example, may produce preferences 
for small risks over large risks as well as preferences for sure-things over 
risky ventures. 

One’s attitude toward risk is distinct from one’s emotional experiences in 
the course of taking risks. Being averse to risk is distinct from feeling anxiety 
in the face of risk and intense regret when a risky decision turns out badly. 
And being attracted to risk is distinct from feeling excitement in the face of 
risk and intense delight when a risky decision turns out well. As Hugh 
Chandler ([1975], p. 225) observes, a drug might suppress the pleasant or 
unpleasant emotional responses to risk without eliminating one’s aversion or 
attraction to risk. 

The points about risk just sketched head off two lines of response to our 
criticisms of MN. The first line concedes that MN does not handle aversion 
or attraction to risk, and proposes indifference to risk as a restriction for 
MN.1 This restriction is not too severe if aversion and attraction to risk are
certain emotional responses to risk. However the restriction is crippling if 
risk is understood in our sense. A decision maker’s attitude toward risk in 
our sense is a central element of decisions where options involve 
uncertainty, not a peripheral emotional element. No rule that assumes 
indifference to risk in our sense is a satisfactory rule for such decisions. 

The second line of response asserts, as in regret theory, that Allais’s and 
Ellsberg’s paradoxes are explained by the emotional results of chances for
various monetary consequences, for instance, the disappointment of bad 
luck and the joy of good luck. Then it claims that these emotional results are 
general rather than particular since they might be produced with arbitrary 
probabilities by other options by means of drugs. Thus these emotional 
results are included among the consequences of an option according to MN. 
They are not ruled out, as the risk involved in an option is, by the 
requirement that consequences be general. The line of response concludes 
that MN produces sound recommendations as long as applications consider 


1 Schoemaker ([1980], p. 14), for example, says that Von Neumann and Morgenstern’s axioms 
rule out ‘any fondness or dislike for gambling for its own sake.’

the emotional objectives of decision makers as well as their other objectives.
And this can be accomplished by following the techniques for decision 
making with multiple objectives that are presented by Keeney and Raiffa 
[1976].1

Our remarks about risk make it possible for us to block this line of response. 
Attitudes toward risk, as understood here, are distinct from the general 
emotional consequences of risky options. Thus we can construct versions of 
Allais’s and Ellsberg’s paradoxes where preferences are due to attitudes to 
risk rather than to general emotional consequences. Then even if regret 
theorists show that MN applied using multiple objectives handles the 
original versions of Allais’s and Ellsberg’s paradoxes, they do not show that 
MN handles these new versions of the paradoxes. 


3 CRITICISM OF MN 


Allais’s and Ellsberg’s paradoxes are the most widely accepted reasons for
holding that MN ignores attitudes toward risk. In this section I will present 
versions of the paradoxes that argue against MN as strongly as possible. 
Then I will supplement the argument from the paradoxes with a new, 
general argument that MN ignores attitudes toward risk. 

Let us start with a version of Allais’s paradox roughly like the version 
investigated by Kahneman and Tversky.2 Suppose that one is asked to
compare some options that award cash prizes depending on the occurrence 
of some events. The probabilities one assigns to the events are determined 
by one’s preferences concerning other, independent options.3 And one is
indifferent to the general consequences of the options except for the 
monetary consequences. In particular, there are no significant general 
emotional consequences. First, one compares two options a and b. Option a 
yields $3,000 with certainty. Option b yields $4,000 with a probability of 4/5, 
and $0 with a probability of 1/5. It would not be unreasonable to prefer
option a because one thinks that the greater probability of gain compensates 
for the smaller possible gain. Next, one compares two other options c and d. 


1 Cf., for example, David Bell [1982]. But note that not all regret theorists defend MN. Graham 
Loomes and Robert Sugden [1982], for example, reject MN, principally because they reject 
the axiom of transitivity of preference that is assumed in definitions of probability and utility 
for MN. The expectation principle they do advance is proposed as a descriptive law and is less 
general than either MN or MB. It is applicable to pairs of actions only. 

2 From our point of view, the difference between the ‘certainty effect’ (or ‘common ratio 
effect’) and the ‘common consequence effect’ is not significant. Both are the result of an 
aversion to the risk generated by the dispersion of the utilities of possible gains and losses. 

3 I suppose that these probabilities have been determined by independent preferences in 
accordance with, for example, Savage’s theory. Kahneman and Tversky do not suppose this. 
They just assume that the decision maker’s probabilities equal the objective probabilities that
he is told govern the gambles. However, then the paradox can be resolved by rejecting their 
assumption and introducing a more accommodating assignment of probabilities. Uday 
Karmarkar [1979], for example, introduces a set of decision weights that, taken as
probabilities, would resolve the paradox.

Option c yields $3,000 with a probability of 1/4, and $0 with a probability of
3/4. Option d yields $4,000 with a probability of 1/5, and $0 with a
probability of 4/5. Here it would not be unreasonable to prefer option d 
because one thinks that the greater possible gain compensates for the smaller 
probability of gain. 

However, preferring a to b and d to c is contrary to MN. According
to MN, the first preference imposes the constraint that U($3,000) >
4/5 × U($4,000). And the second preference imposes the constraint that
1/4 × U($3,000) < 1/5 × U($4,000), in other words, that U($3,000) <
4/5 × U($4,000). Since these constraints are incompatible, there are no
probability and utility assignments according to which the preferences 
maximize expected utility. 
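
The incompatibility of the two constraints can be checked mechanically. The following Python sketch is illustrative only: it assumes the stated probabilities, treats money as the only consequence, and searches a small grid of integer utility values (the grid and all names are assumptions introduced here). It also lets U($0) vary instead of fixing it at zero, as the constraints above implicitly do.

```python
from fractions import Fraction as F
from itertools import product

def expected_utilities(u0, u3000, u4000):
    """Expected utilities of options a, b, c, d when money is the only consequence."""
    eu_a = u3000
    eu_b = F(4, 5) * u4000 + F(1, 5) * u0
    eu_c = F(1, 4) * u3000 + F(3, 4) * u0
    eu_d = F(1, 5) * u4000 + F(4, 5) * u0
    return eu_a, eu_b, eu_c, eu_d

def allais_preferences_fit(u0, u3000, u4000):
    """True if preferring a to b and d to c maximizes expected utility."""
    eu_a, eu_b, eu_c, eu_d = expected_utilities(u0, u3000, u4000)
    return eu_a > eu_b and eu_d > eu_c

# Search every integer utility assignment in a small grid (an arbitrary choice).
candidates = range(-10, 11)
fits = [u for u in product(candidates, repeat=3) if allais_preferences_fit(*u)]
```

The search comes back empty, and the reason is visible in the arithmetic: the expected-utility difference between c and d is always one quarter of the difference between a and b, so the two preferences pull in opposite directions whatever utilities are assigned.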

Proponents of MB say that the preferences advanced are reasonable, 
although contrary to MN, because MN ignores differences in risk due to 
differences in the dispersion of the utilities of possible gains. In particular, 
MN ignores the special advantage option a enjoys because it is a sure-thing 
and the dispersion of the utilities of its possible consequences is zero. 
According to MB, the risk due to the dispersion of the utilities of possible 
monetary gains makes the utility of the results of each option given each state 
less than the utility of the monetary gain from the option given the state, 
unless the monetary gain from the option is certain. Hence risk lowers the 
utility of every option-state pair, except where the option yielding $3,000 
with certainty is concerned. As a result, taking risk into account, the 
preferences advanced yield the consistent constraints that for certain 
adjustment factors x, y, and z, U($3,000) > 4/5 × U($4,000) − x, and
U($3,000) − y < 4/5 × U($4,000) − z.
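
That the adjusted constraints are jointly satisfiable can be shown by exhibiting one witness. The sketch below is a hedged illustration: the utilities and the adjustment factors x, y, and z are invented numbers, chosen only to show consistency, and the text does not suggest any particular values.

```python
from fractions import Fraction as F

def mb_constraints_hold(u3000, u4000, x, y, z):
    """Check the two risk-adjusted constraints obtained from preferring a to b and d to c."""
    first = u3000 > F(4, 5) * u4000 - x
    second = u3000 - y < F(4, 5) * u4000 - z
    return first and second

# Hypothetical utilities and adjustment factors (illustrative values only).
witness = mb_constraints_hold(F(3), F(4), x=F(1, 2), y=F(1, 2), z=F(1, 2))

# With no adjustment for risk the same utilities fail the first constraint.
no_adjustment = mb_constraints_hold(F(3), F(4), x=0, y=0, z=0)
```

Here the first constraint holds because 3 exceeds 16/5 − 1/2, and the second holds because 3 − 1/2 falls short of 16/5 − 1/2.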

Next consider the following version of Ellsberg’s paradox. Suppose that 
one is asked to compare options as above. However this time the 
consequences of the options depend on the colors of balls drawn at random 
from each of two urns. One knows that the first urn contains a mixture of 50 
black balls and 50 red balls. And one knows that the second urn contains a 
mixture of 100 black and red balls, but one is completely ignorant of the ratio 
of black to red. One is indifferent to general consequences besides monetary 
consequences. In particular, there are no significant general emotional 
consequences. First, one compares e, getting $1,000 if a black ball is drawn 
from the 50-50 urn, and f, getting $1,000 if a black ball is drawn from the 
mystery urn. It would not be unreasonable to prefer e because one is more 
sure of what it offers. Next, one compares g, getting $1,000 if a red ball is 
drawn from the 50-50 urn, and h, getting $1,000 if a red ball is drawn from 
the mystery urn. Here it would not be unreasonable to prefer g for similar 
reasons. 

But preferring e to f and g to h is contrary to MN. To see this, let B1 and
R1 stand, respectively, for getting black and red from the 50-50 urn. And let
B2 and R2 stand, respectively, for getting black and red from the mystery
urn. The first preference imposes the constraint that P(B1)U($1,000) >
P(B2)U($1,000), or P(B1) > P(B2). The second preference imposes
the constraint that P(R1)U($1,000) > P(R2)U($1,000), or P(R1) > P(R2).
And the calculus of probability imposes the constraint that P(B1) +
P(R1) = P(B2) + P(R2). Since these constraints are incompatible, there are
no probability and utility assignments according to which the preferences
maximize expected utility.
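
The conflict among these three constraints can also be checked by brute force. The Python sketch below makes two assumptions, stated here for illustration: each draw yields either a black or a red ball, so the two probabilities for an urn sum to 1, and U($1,000) exceeds U($0), so the utilities cancel out of the comparison. The grid of candidate probabilities is likewise an arbitrary choice.

```python
from fractions import Fraction as F

def ellsberg_preferences_fit(p_b1, p_b2):
    """True if preferring e to f and g to h maximizes expected utility.

    p_b1, p_b2: probabilities of black from the 50-50 urn and the mystery urn.
    Red probabilities follow from the assumption that each urn's two sum to 1.
    """
    p_r1, p_r2 = 1 - p_b1, 1 - p_b2
    return p_b1 > p_b2 and p_r1 > p_r2

# Candidate probabilities in steps of 1/20 (an arbitrary grid).
grid = [F(k, 20) for k in range(21)]
ellsberg_fits = [(p1, p2) for p1 in grid for p2 in grid
                 if ellsberg_preferences_fit(p1, p2)]
```

No pair in the grid satisfies both preferences. Adding the two strict inequalities would give P(B1) + P(R1) > P(B2) + P(R2), which contradicts the constraint from the calculus of probability.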

Proponents of MB say the preferences are reasonable, although contrary 
to MN, because MN ignores the risk generated by the absence of evidence 
for the probabilities of black and of red from the mystery urn. According to 
MB, the risk due to the absence of evidence concerning the mystery urn 
makes the utilities of the results of options involving the mystery urn less 
than the utilities of the corresponding results of options involving the 50-50 
urn. Hence the utilities of option-state pairs for the mystery urn are less than 
the utilities of the corresponding option-state pairs for the 50-50 urn. As a
result, taking risk into account, the preferences advanced yield the
constraints that P(B1) > P(B2) − x and that P(R1) > P(R2) − y. And these
constraints are consistent with the background constraint that P(B1) +
P(R1) = P(B2) + P(R2).1
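
A single witness suffices to show that the risk-adjusted constraints are consistent. In the Python sketch below the probabilities and the adjustment factors x and y are hypothetical values chosen for illustration; the text does not fix them.

```python
from fractions import Fraction as F

def mb_ellsberg_constraints_hold(p_b1, p_r1, p_b2, p_r2, x, y):
    """Check the risk-adjusted constraints together with the background constraint."""
    prefers_e_to_f = p_b1 > p_b2 - x
    prefers_g_to_h = p_r1 > p_r2 - y
    background = p_b1 + p_r1 == p_b2 + p_r2
    return prefers_e_to_f and prefers_g_to_h and background

half = F(1, 2)

# Hypothetical assignment: equal probabilities for both urns, small adjustments.
witness = mb_ellsberg_constraints_hold(half, half, half, half, x=F(1, 10), y=F(1, 10))

# Without adjustment factors the same probabilities fail the strict inequalities.
no_adjustment = mb_ellsberg_constraints_hold(half, half, half, half, x=0, y=0)
```

The witness assigns probability 1/2 to each color for both urns, so the preferences are carried entirely by the adjustment for the risk attached to the mystery urn.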

In the dispute concerning Allais’s and Ellsberg’s paradoxes, most 
proponents of MN, for example, Savage ([1954], p. 101 ff.) and Raiffa 
([1961] and [1968]), claim that the preferences advanced in the paradoxes 
are irrational. They claim that since MN handles attitudes toward risk 
through assignments of probabilities and utilities, violations of MN cannot 
be defended by appealing to aversion to risk. Moreover, they attempt to 
bolster this claim with various secondary arguments that the preferences 
advanced in the paradoxes are inconsistent and that the axioms of preference 
that entail MN are sound. 

Before discussing the claim that MN handles attitudes toward risk, I will 
briefly treat the secondary arguments. I will not try to review and refute all 
of them. They are too complex and too numerous. Instead I will discuss one 
paradigm and indicate the general error it makes. This is sufficient here 
since the secondary arguments are catalogued, analysed, and criticized 
elsewhere.² Also, our argument that MN does not handle attitudes toward


1 Isaac Levi [1984] and Frederic Schick ([1984], Chp. 3, Sec. 7) give accounts of Allais’s and 
Ellsberg’s paradoxes that appeal to the indeterminacy of probabilities and utilities rather than
aversion to risk. However the accounts that they give assume controversial decision rules for 
cases involving indeterminate probabilities and utilities. Furthermore, as Levi concedes, 
their approach does not accommodate versions of Allais’s and Ellsberg’s paradoxes involving 
genuine preferences among options, and not just choices among options. For such versions of
the paradoxes involve violations of the rule that preferences maximize expected utility 
according to some admissible pair of probability and utility functions. Since by stipulation 
our versions of the paradoxes involve preferences, Levi’s and Schick’s approach does not help 
with them even if it does resolve other versions of the paradoxes. 

2 For a more detailed criticism of the secondary arguments used to support MN, see, for 
example, McClennen [1983] and Sowden [1984]. My criticism of Raiffa’s argument is similar 
in some respects to McClennen’s criticism of that argument. 
risk shows that none of the secondary arguments in support of MN can
succeed. 

Raiffa ([1968], pp. 82-85) argues that the preferences advanced in Allais’s 
paradox are inconsistent. One of his arguments concerns conditional 
preferences in a certain situation involving options a and b of Allais’s 
paradox. For our version of the paradox, the relevant situation is 
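diagrammed below.


[Figure: decision tree. A chance node leads with probability 1/4 to a decision node. At the decision node, option a yields $3,000, and option b yields $4,000 with probability 4/5 and $0 otherwise.]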





First the argument points out that if a person prefers a to b in Allais’s
paradox, consistency requires that in the situation diagrammed he prefer a 
to b on condition that the decision node is reached. Next the argument 
claims that a preference for a over b on condition that the decision node 
is reached is equivalent to a preference for a 1/4 chance for $3,000 over a 
1/4 × 4/5, or 1/5, chance for $4,000. Since the latter preference is the same as


1 According to Jonathan Cohen’s [1981] position, our intuitive judgments in Allais’s and 
Ellsberg’s paradoxes refute MN by themselves. However his position attributes too much 
weight to these intuitive judgments. We have to also take into account relevant theoretical 
considerations. For these considerations might show that the intuitive judgments are 
mistaken. In general, the data concerning violations of MN do not constitute a conclusive
case against MN. They simply show that our pre-theoretical intuitions run contrary to the
rule and therefore suggest that adherence to the rule is the result of a [remainder illegible]







a preference for c over d in Allais’s paradox, the argument concludes that in
Allais’s paradox preferences for a over b and for d over c are inconsistent. 
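
The arithmetic behind the argument's second step can be sketched as follows. This is only an illustration of how the chances compound under precommitment. It assumes, as the figures quoted in the argument suggest, that the decision node is reached with probability 1/4, that a then pays $3,000 for certain, and that b then pays $4,000 with probability 4/5. The function name is hypothetical.

```python
from fractions import Fraction


def chance_of_prize(p_reach_node, p_win_at_node):
    """Unconditional chance of the prize when one is committed in advance."""
    return p_reach_node * p_win_at_node


P_REACH = Fraction(1, 4)  # assumed probability of reaching the decision node

precommit_a = chance_of_prize(P_REACH, Fraction(1))     # a pays $3,000 for sure at the node
precommit_b = chance_of_prize(P_REACH, Fraction(4, 5))  # b pays $4,000 with chance 4/5 at the node
print(precommit_a, precommit_b)  # prints: 1/4 1/5
```

The computation settles only what the two precommitment lotteries are. Whether a conditional preference at the node is equivalent to a preference between those lotteries is a separate question that the arithmetic does not decide.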

The weak spot in this argument is the claim that a preference for a over b 
on condition that the decision node is reached is equivalent to a preference 
for c over d. The conditional preference involves the assumption that the 
decision node is reached, whereas the nonconditional preference does not. 
As a result, the nonconditional preference is riskier than the conditional 
preference. Given this difference, the conclusion that the preferences 
are equivalent is unwarranted.¹ In general, the secondary arguments
supporting MN in Allais’s and Ellsberg’s paradoxes make the same error as 
MN. They fail to take account of risk. 

Let us now evaluate the principal argument that the preferences in Allais’s 
and Ellsberg’s paradoxes are irrational, namely, the argument that aversion 
to risk cannot justify those preferences since MN already takes aversion to 
risk into account. We will begin by considering the reasons for holding that 
MN takes account of attitudes to risk. Then we will present the case for the 
opposite view. 

As noted in Section 1, proponents of MN argue that probability and 
utility assignments under MN reflect attitudes towards risk. In particular, 
they claim that aversion to risk makes the utility curve for a quantitative 
commodity concave down. One of the most forceful arguments for this view 
is constructed by Michael Rothschild and Joseph Stiglitz [1970].² They
show that having a concave utility curve entails two phenomena that are
apparent manifestations of aversion to risk. Both phenomena concern
options o and o' that produce amounts of a commodity with various
probabilities, and that have the same expected value of the amount
produced. The first phenomenon is roughly that o is preferred to o' if o'
produces amounts of the commodity by means of a two-staged lottery in
which the first stage is equivalent to o. The second phenomenon is roughly
that o is preferred to o' if the probabilities of extreme amounts of the
commodity are greater for o'. These patterns of preference do at first glance
appear to indicate aversion to risk, and it is remarkable that they both follow
from the concavity of the utility curve for the commodity. However, there is a
weakness in the argument from this entailment to the view that the concavity 
of the utility curve represents aversion to risk. Since the patterns of 


1 The temptation to conclude that the preferences are equivalent arises, I think, from 
inattention to the difference between (1) a conditional preference for a over b and (2) a 
nonconditional preference for precommitment to a rather than precommitment to b. (2) is 
equivalent to a nonconditional preference for c over d in Allais’s paradox. And so if (1) were 
equivalent to (2), (1) would be equivalent to a nonconditional preference for c over d in Allais’s
paradox. But (1) and (2) are not equivalent since a preference for precommitment to a involves 
a risk a conditional preference for a does not involve. 

2 Rothschild and Stiglitz show the equivalency of definitions of risk in terms of (1) the
concavity of utility curves, (2) two-staged lotteries, and (3) the tails of probability 
distributions. This equivalency produces the argument presented in the text. 
preference concern probability distributions of amounts of the commodity
rather than probability distributions of utilities of these amounts, they are 
explained by the diminishing marginal utility of the commodity in the way 
that an aversion to actuarially fair gambles is classically explained by the 
diminishing marginal utility of money.¹ Given this, it is doubtful that the
concavity of the utility curve represents aversion to risk in our nontechnical 
sense. 
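
The first phenomenon can be illustrated numerically. The sketch below is an assumption-laden example, not a reconstruction of Rothschild and Stiglitz's proof: the square-root utility curve, the amounts, and the helper names are all hypothetical. It builds o' from o by adding a fair second-stage gamble, so that both options have the same expected amount, and compares their expected utilities under the concave curve.

```python
import math


def expected_value(lottery):
    """Expected amount of a lottery given as (amount, probability) pairs."""
    return sum(p * x for x, p in lottery)


def expected_utility(lottery, utility):
    """Expected utility of a lottery under a utility curve for amounts."""
    return sum(p * utility(x) for x, p in lottery)


def add_fair_noise(lottery, spread):
    """Second stage: add +spread or -spread to each amount with equal chance."""
    return [(x + d, p / 2) for x, p in lottery for d in (-spread, spread)]


o = [(100.0, 0.5), (400.0, 0.5)]   # hypothetical amounts of the commodity
o_prime = add_fair_noise(o, 50.0)  # two-staged lottery whose first stage is o

eu_o = expected_utility(o, math.sqrt)
eu_o_prime = expected_utility(o_prime, math.sqrt)
print(expected_value(o), expected_value(o_prime))  # same expected amount
print(eu_o > eu_o_prime)                           # o is preferred under the concave curve
```

Nothing in the computation refers to an attitude toward risk. The preference for o falls out of the shape of the utility curve for amounts alone, which is why the pattern can be explained by diminishing marginal utility.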

Some proponents of MN respond to this objection by arguing against the 
doctrine of diminishing marginal utility. Keeney and Raiffa ([1976], p. 150), 
for example, assert that the classical doctrine of diminishing marginal utility 
involves an illegitimate, nonprobabilistic concept of utility. This response, 
however, presumes an operationalistic criterion of meaning (not just an 
operationalistic methodology). And operationalistic criteria of meaning 
have been extensively criticized in contemporary philosophy of science. 
Moreover, the techniques of Jeffrey and Bolker described above produce a 
concept of utility that is operational and nonetheless can support the 
principle of diminishing marginal utility. 

Now let us begin constructing our general argument that MN does not 
handle attitudes toward risk. This argument shows that an appeal to 
aversion to risk does not resolve Allais’s and Ellsberg’s paradoxes in an ad 
hoc way. Furthermore, besides supporting criticism of MN through the
paradoxes, the argument also provides a direct criticism of MN from 
theoretical grounds that are independent of intuitions about rational 
preferences in particular cases. Thus it constitutes an especially compelling 
reason to reject MN. 

As announced at the outset, the argument assumes that MB is correct if 
MN is. This assumption does not beg the question since most proponents of 
MN do not object that MB is false, but rather that it is inapplicable. In any 
case I will support the assumption by defending MB in the next section.²

In order to present the argument, I will first sketch a short, rough version 
of the argument that conveys the main idea. Afterwards, I will present a 
longer, polished version of the argument. 

The short version of the argument goes as follows. Grant that if MN is 
correct, MB is correct as well. Then if MN is correct, it must produce the 
same recommendations as MB. Therefore, since the possible consequences 
used in MB include the risk particular to an option whereas the possible 
consequences used in MN exclude it, the utilities of the possible 
consequences used in MN must be adjusted for the decision maker’s 
attitude to the risks particular to the options that might produce them. 
However, since the possible consequences used in MN are general, and 
might be produced by many options of varying degrees of risk, there is no 
way to adjust their utilities for the risks particular to all the options that 


1 James Dyer and Rakesh Sarin ([1982], p. 375) make a similar point.
2 One neoclassical formulation of MB is supported in more detail in Weirich [1977].
might produce them, unless the decision maker is indifferent to variations in
risk. But if MN requires indifference to variations in risk, then it does not 
handle attitudes toward risk. 

This problem cannot be avoided by twisting the utility curve for 
consequences in the manner of Milton Friedman and Leonard Savage 
[1948] or Harry Markowitz [1952]. Their methods still allow only one utility 
value per consequence. Nor can the problem be avoided by arguing that 
consequences are always themselves lotteries of some kind so that the 
utilities attached to them reflect attitudes toward risk. This line of argument 
may show that MN handles attitudes toward the risks involved in 
consequences. But it does not show that MN handles attitudes toward the 
risks particular to options. And our argument shows that MN ignores those 
risks. No assumption about the nature of consequences can upset the 
argument. For the argument assumes nothing about consequences in MN 
except that they are general. In particular, it does not exclude the possibility 
that they are lotteries, and so is not affected if in fact they are. 

The brief argument presented in the preceding two paragraphs is on the 
right track, but has some shortcomings. First, it focuses exclusively on the 
utilities used by MN. But adjustments for attitudes to risk can appear in 
the probabilities used by MN as well as the utilities. Second, although risk 
constitutes a reason for adjusting the utility of a consequence in MN, and 
although the risks particular to the options that might produce the 
consequence vary, it is still possible that there is one suitable value for the 
utility of the consequence. There may be some factor besides risk that also 
constitutes a reason for adjusting the utility of the consequence. And this 
factor may provide a reason for adjusting in the opposite direction. If this 
factor varies in the set of options that might produce the consequence, the 
reasons it provides for varying the utility assigned to the consequence may 
precisely counterweigh the reasons risk provides. Then, other things being
equal, no variation in that utility is required. This possibility may seem 
unlikely, but it has not been ruled out, and the issue is too complex to trust to 
intuition. 

In the following longer version of the argument, I will remove these 
shortcomings. First, I will appeal to measurement theory to show that even 
if the probabilities used in MN are adjusted for risk, the utilities used in MN 
must be adjusted as well. Second, I will construct a special case where 
nothing that varies along with risk requires a countervailing adjustment in 
utilities. Then, arguing as above, I will claim that in this special case MN 
requires indifference to risk. I will conclude that since MN requires 
indifference to risk there, it does not handle attitudes to risk. 

Some terminology will be helpful in presenting the argument. Let 
CB(o, s) and CN(o, s) stand respectively for the consequences of an option o 
given a state s as taken in MB and as taken in MN. Let PB and UB, and PN 
and UN stand respectively for probability and utility as interpreted in MB 
and as interpreted in MN. And let R(o) stand for the risk involved in o. 
For simplicity, put UB and UN on the same scale by letting them have the
same unit and zero point. In other words, specify UB and UN so that for
some x and y, UB(x) = UN(x) = 1 and UB(y) = UN(y) = 0. This is
possible for two reasons. First, the arguments of UB and UN are, we
assume, the same kind of object, perhaps propositions or events. And 
second, UB ranges over all results, general as well as particular, so the 
domain of UB includes the domain of UN. 

We will show that if MN is correct, then in the special case to be 
constructed, the decision maker is indifferent to risk. The argument for this 
conditional claim begins with the assumption that MN is correct. Given that 
if MN is correct, then so is MB, it follows that MB is also correct. 

Now consider a decision maker whose preferences are rational and, by our 
assumption that MN is correct, satisfy the axioms of preference assumed by 
a formulation of MN (say, Savage’s). Then according to a “representation” 
theorem of measurement theory (the one Savage proves), given a unit and a
zero point for utility, there is just one utility function over option-state pairs
whereby the decision maker’s preferences maximize expected utility. By
MN, the utility the function assigns to an option-state pair (o, s) is equal to
UN(CN[o, s]). On the other hand, by MB, it is equal to UB(CB[o, s]).
Hence, UN(CN[o, s]) = UB(CB[o, s]) for all o and s.

Next we construct our special case and show that in it the foregoing entails 
that the decision maker is indifferent to risk. The case we imagine has the 
following features. First, the results of o given s that are included among 
CB(o, s) but not among CN(o, s), except perhaps R(o), are matters of 
indifference. Thus UB(CB[o, s]) depends only on CN(o, s) and R(o). 
Furthermore, UB(CB[o, s]) depends only on UB(CN[o, s]) and UB(R[o]),
not on any other characteristics of CN(o, s) or R(o), or the way in which they
are combined in CB(o, s). As a result, for some function F, UB(CB[o, s]) =
F[UB(CN[o, s]), UB(R[o])]. Second, F[UB(CN[o, s]), UB(R[o])] varies
with UB(R[o]). In other words, F is not constant with respect to the second
argument. We assume nothing further about F; in particular, we do not
assume that F is additive. Third and last, there is an option o and a state s 
such that CN(o, s) is a possible consequence of several options of varying 
degrees of risk. These three features of our special case are mild and do not 
entail indifference to risk by themselves. 

Now we show that in the special case just constructed indifference to risk 
follows from the results of our hypothesis that MN is correct. According to 
the conditions of the special case, there is an o and s such that CN(o, s) is a 
possible consequence of several options of varying degrees of risk. Let 
cn stand for this possible consequence. By the equality of UN(CN[o, s]) 
and UB(CB[o, s]) for all o and s, it follows that for any o and s such that 
cn = CN(o, s),


UN(cn) = UN(CN[o, s]) = UB(CB[o, s]).

Also, by the dependency condition, for any o and s such that cn = CN(o, s),

UB(CB[o, s]) = F[UB(CN[o, s]), UB(R[o])] = F[UB(cn), UB(R[o])].

Hence for any o such that for some s, cn = CN(o, s),

UN(cn) = F[UB(cn), UB(R[o])].


This implies that F[UB(cn), UB(R[o])] is the same for any such o. But since 
by hypothesis F is sensitive to changes in UB(R[o]), this in turn implies that
UB(R[o]) is the same for any such o. On the other hand, given our 
hypothesis that cn is a possible consequence of several options of varying 
degrees of risk, R(o) is not the same for any such o. Therefore UB must be 
insensitive to changes in R(o). In other words, the decision maker must be 
indifferent to risk. 
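
The structure of this step can be illustrated with a small computation. The sketch below is hypothetical throughout: it uses an additive F purely for concreteness (the argument itself does not assume additivity), and the utility figures are invented. It asks whether a single value UN(cn) can equal F[UB(cn), UB(R[o])] for every option o that might produce cn.

```python
def admits_single_un_value(ub_cn, risk_utilities, combine):
    """True if F[UB(cn), UB(R[o])] is the same for every option producing cn."""
    values = {combine(ub_cn, ub_r) for ub_r in risk_utilities}
    return len(values) == 1


def additive(u_consequence, u_risk):
    # Illustrative F only; any F that varies with its second argument would do.
    return u_consequence + u_risk


# Hypothetical UB(R[o]) for three options that can each produce cn.
risk_averse = [-0.1, -0.3, -0.5]      # utilities track differences in risk
risk_indifferent = [0.0, 0.0, 0.0]    # utilities ignore differences in risk

print(admits_single_un_value(0.5, risk_averse, additive))       # False
print(admits_single_un_value(0.5, risk_indifferent, additive))  # True
```

Only the agent whose UB(R[o]) is the same across options of differing risk leaves room for one utility value per general consequence, which is the indifference the argument derives.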

We have shown that MN is correct in the special case only if decision 
makers are indifferent to risk. This shows that MN does not handle attitudes 
to risk in general. And given the mildness of the conditions of the special 
case, it also shows that MN does not handle attitudes to risk in any 
interesting class of cases. 


4 THE CORRECTNESS OF MB 


Intuitively, it seems that the best way to make MEU respond to attitudes 
toward risk is to interpret consequences broadly so that the risk involved in
an option counts as a consequence of the option. That is, the best course
seems to be to interpret the consequences of o given s so that they encompass
all the results of o given s, including the risk run by realising o.¹ In the
remainder of the paper I will argue for the version of MEU that adopts this
interpretation of consequences, viz., MB. In other words, I will argue for
defining the expected utility of o as ΣPB(s_i) UB(CB[o, s_i]).²

Besides the intuitive attractiveness of letting the utility for an option-state 
pair attend to all the results of the option given the state, the main positive 
reason for MB is that it resolves Allais’s and Ellsberg’s paradoxes. By 
making consequences include risk, it makes expected utilities sensitive to 
the risks that are the source of trouble in these paradoxes, and so brings 
MEU into agreement with the preferences advanced in them. The rest of the 
case for MB consists of responses to various objections. In this section I 


1 In the face of the problems with attitudes toward risk, Jagdish Handa ([1977], p. 115 ff.) and 
Nils-Eric Sahlin and Peter Gärdenfors [1982] propose abandoning MEU for other decision
rules. But there is no motivation for the rules they propose if there is a version of MEU that 
handles the problems. 

Here, of course, we are defending MEU as a normative rule. It may well be that no 
formulation of MEU is adequate as a descriptive law. Kahneman and Tversky [1979] and 
Machina [1982], for example, may provide descriptive accounts of decision making that are 
more satisfactory than MEU. 

2 This proposal can be put into a simpler, equivalent form. UB(CB[o, s]) can be reduced to the
utility of o itself given s, or UB(o, s), since UB(CB[o, s]) and UB(o, s) are equivalent assuming
the standard idealization for MEU. (See Weirich ([1982], p. 74).) Hence ΣPB(s_i) UB(CB[o,
s_i]) can be reduced to ΣPB(s_i) UB(o, s_i).
consider an objection concerning the correctness of MB and in the next
section an objection concerning its methodological adequacy. 

Some may object that MB is incorrect because it double counts risk. They 
may contend that the very method of calculating the expected utility of an 
option at least partly takes into account the risk involved in the option. 
Hence if the interpretation of consequences is broadened to include that 
risk, as in MB, the risk involved in the option is counted twice. 

We can refute this objection, however, by refuting the premiss. Plainly, 
the method of calculating the expected utility of an option does not even 
partly register the utility of the risk involved in the option, since the method 
is the same for each person regardless of the person’s attitude toward risk. 

The only way to defend the objection’s premiss is to claim that there is just 
one rational attitude toward risk and the method of calculating expected 
utilities expresses it. However, neither side of the dispute about expected 
utility and risk is willing to make this claim. Both sides assume that rational 
decision makers may have different attitudes toward risk, and that decision 
rules should provide for such differences. Therefore, at least from the 
standpoint of this dispute, the objection about double counting risk fails. 

Perhaps those who object that MB double counts risk would say that I 
have misrepresented their argument. It need not claim that the method of 
calculating the expected utility of an option at least partly registers the utility 
of the risk involved in the option. It might claim that since the risk involved 
in an option depends upon the possible consequences of the option, one 
must derive the risk involved in an option from its possible consequences 
before one can include that risk among them. But if, after deriving that risk 
from the possible consequences, one includes it among them, one has double 
counted it.¹

This argument is also unconvincing. True, the risk involved in an option 
depends upon its possible consequences. But this does not make it 
impossible to count the risk among those possible consequences. The risk 
can be included as long as its inclusion produces an equilibrium between the 
effect of the possible consequences on the risk and the effect of the risk on the 
possible consequences. And there is no reason to doubt that such an 
equilibrium is generally produced. 

Our appeal to the existence of equilibria between risk and possible 
consequences is supported by our ability to assign utilities to consequences 
taken broadly. However, one is understandably curious about the nature of
such equilibria. What are the details? A full answer would contain two parts.
The first part would provide a general equation for UB(CB[o, s_i]) in terms of
UB(R[o]) and other factors, and a general equation for UB(R[o]) in terms of
UB(CB[o, s_i]) for i = 1, 2, ..., n. Then the second part would demonstrate
that those equations have simultaneous solutions under suitable conditions. 


1 This line of argument is suggested in a remark by Robin Pope ([1984], p. 262) on the utility of 
gambling and the separability of probabilities and utilities in MEU. 
Unfortunately, I am not yet able to supply this sort of answer. But I can at
least illustrate the nature of the equilibria reached. 

Consider the following equations for the utilities of consequences and the 
utility of risk. 


For all i, UB(CB[o, s_i]) = UB(CN[o, s_i]) + UB(R[o]); and
for some constant k, UB(R[o]) = k × Var[UB(CB[o, s_i])].


As Paul Samuelson ([1983], p. 522 f.) points out, these equations do not hold 
in general. But imagine some special case in which they do hold. Then in 
that case consider the plausible condition that UB(R[o]) = k ×
Var[UB(CN[o, s_i])]. Given this condition, the equation for the utilities of
consequences entails the equation for the utility of risk, since the variance of 
a random quantity is not altered by adding a constant to the quantity. Thus 
if we suppose that the condition holds, the displayed equations are 
simultaneously solved, and in our special case we obtain an equilibrium 
between risk and possible consequences. 
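
A numerical check makes the equilibrium concrete. The following is a sketch under stated assumptions: the utilities of the general consequences, the state probabilities, and the constant k are invented figures, and the function names are hypothetical. It sets UB(R[o]) from the variance of the general-consequence utilities, adds it to each of them, and then confirms that the risk equation stated in terms of the broad consequences is satisfied as well.

```python
def variance(values, probs):
    """Variance of a discrete random quantity with the given probabilities."""
    mean = sum(p * v for v, p in zip(values, probs))
    return sum(p * (v - mean) ** 2 for v, p in zip(values, probs))


def solve_equilibrium(ub_cn, probs, k):
    """Apply the condition UB(R[o]) = k * Var[UB(CN)] and the additive equation."""
    ub_r = k * variance(ub_cn, probs)
    ub_cb = [u + ub_r for u in ub_cn]  # UB(CB[o, s_i]) = UB(CN[o, s_i]) + UB(R[o])
    return ub_r, ub_cb


probs = [0.5, 0.5]   # hypothetical PB(s_1), PB(s_2)
ub_cn = [0.0, 1.0]   # hypothetical UB(CN[o, s_i])
k = -0.2             # hypothetical constant; negative for aversion to risk

ub_r, ub_cb = solve_equilibrium(ub_cn, probs, k)
# The risk equation in terms of broad consequences: UB(R[o]) = k * Var[UB(CB)].
residual = k * variance(ub_cb, probs) - ub_r
print(abs(residual) < 1e-12)
```

The residual vanishes (up to rounding) because shifting every utility by the same constant UB(R[o]) leaves the variance unchanged.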


5 THE SERVICEABILITY OF MB 


MB plainly applies in ideal cases where the utilities for option-state pairs
are known. So it has some interest from a theoretical point of view. 
However, critics may charge that MB has no interest from a practical point 
of view. They may complain that under MB the utilities for option-state 
pairs are hard to assess because some of the factors they encompass, such as 
attitudes toward risk, follow no regular pattern. And therefore they may 
claim that assessing expected utilities as defined in MB is as hard as 
comparing options directly, without the help of expected utilities, so that 
MB is disappointing as an aid for real-life decisions. This, I think, is the 
main objection to MB. 

To reply to this objection, I will show that MB can help solve realistic 
decision problems. My illustrations will be more cogent the less they rely on 
controversial principles for evaluating risk. Hence I will not appeal to 
principles at the cutting edge of research on risk. Rather I will use 
commonplace principles as simple and convincing as possible and yet 
adequate for applying MB. Friends of MB can pursue the references for 
bolder, more sophisticated means of applying MB.¹

My illustrations will take advantage of the fact that comparisons of 
expected utilities are sufficient for finding options of maximal expected 
utility; they will not specify expected utilities quantitatively. However, for 
simplicity of exposition, they will assume the existence of the probabilities 


1 For some interesting proposals for evaluating risk, see for example, Harry Markowitz ([1959], 
Chp. XIII), Peter Fishburn [1977], and Ole Hagen [1979]. And for a general review of 
proposals for evaluating risk, see Robert Libby and Peter Fishburn [1977], Paul Schoemaker 
([1980], Chp. 3), and John Payne [1982]. 
and utilities needed for the existence of expected utilities. This assumption
can be dispensed with by recasting the illustrations so that instead of 
applying MB directly, they apply the subsidiary principle that when beliefs 
and desires are not quantitative, preferences among options should at least 
accord with MB under some assignments of probabilities and utilities that 
are consistent with beliefs and desires. 

To begin, we restrict ourselves to cases where ΣPB(s_i) UB(CB[o, s_i]) can
be separated into the expected utility of the general consequences of o and
the utility of the risk involved in o, i.e., ΣPB(s_i) UB(CN[o, s_i]) and UB(R[o]).
More specifically, we assume that there is some function F such that


ΣPB(s_i) UB(CB[o, s_i]) = F[ΣPB(s_i) UB(CN[o, s_i]), UB(R[o])].


As a further restriction, we assume that F increases as its arguments 
increase. These restrictions are not met in all cases. But they are met, for 
example, in standard gambling cases. In those cases it is assumed that the 
gamblers care only about monetary gains and losses and risks, and do not 
have, say, a moral aversion to gambling. And it is assumed that the risk taken 
in obtaining some monetary consequence has no influence on the utility of 
the consequence, but is rather so much water under the bridge.¹

Since in the special cases we are considering MB evaluates an option in 
terms of the mean utility of its possible general consequences and the utility 
of the risk it involves, MB constitutes a so-called ‘mean-risk’ decision rule.²

The next step is to find some sets of options over which ΣPB(s_i) UB(CN[o,
s_i]) and UB(R[o]) increase or decrease together so that the resultant increases
or decreases in ΣPB(s_i) UB(CB[o, s_i]) give us a means of comparing options.
The literature of the mean-risk school suggests three cases especially 
strongly. As in the usual applications of MN, these cases assume that the 
decision makers are typical and in typical circumstances. In particular, they 
assume that the decision makers are averse to risk, that the relevant general 
consequences of gambles are monetary gains and losses, and that the utility 
curve for wealth is concave down. 


1 I think it would be more intuitive to separate ΣPB(s_i) UB(CB[o, s_i]) into the expected utility of
the causal effects of o and the utility of the risk involved in realizing o. But I will not pursue 
this refinement here. 

2 The leading idea of the mean-risk school is to evaluate an option in terms of the mean value of 
possible gains and losses and the risk involved. So claiming that in certain cases the value of o 
equals F[ΣPB(s_i) UB(CB[o, s_i]), UB(R[o])] puts us in the mean-risk school. But there are
several kinds of mean-risk formulas from which ours should be distinguished. First, many
mean-risk formulas are proposed as descriptive laws whereas ours is proposed as a normative
rule. Second, mean-risk rules are often proposed as rivals to MEU, whereas ours agrees with
MB. Third, some of the formulas use the mean value of possible monetary gains and losses
whereas ours uses the mean value of possible utility gains and losses. Fourth, some specify
precisely the means of combining the mean and the risk terms, whereas ours specifies only that
the means of combining is some monotonic function. And fifth, some have different 
restrictions. 

Markowitz ([1959], Chp. XIII) gives a classic presentation of some mean-risk decision 
rules. 
First, consider a sequence of gambles with increasing stakes at fair
odds on the same uncertain event. As o ranges over these gambles,
ΣPB(s_i) UB(CN[o, s_i]) decreases because the utility curve for wealth is
concave. Also, UB(R[o]) (a negative quantity) decreases (increases in
absolute value) because the dispersion of the utilities of possible general
consequences increases, enlarging the risk involved. Hence, as the stakes
increase, ΣPB(s_i) UB(CB[o, s_i]) decreases. Therefore, if one has to choose
among such gambles, choosing the gamble with the smallest stakes 
maximizes expected utility. 
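
This first case can be sketched numerically. The example rests on assumptions that go beyond the text: a logarithmic utility curve for wealth, a risk term proportional to the variance of the utilities of the general consequences, an additive F, and invented figures for wealth, stakes, and the constant K.

```python
import math

WEALTH = 1000.0  # hypothetical current wealth
K = -1.0         # hypothetical weight on variance; negative for aversion to risk


def mean_and_risk(stake):
    """Mean utility of general consequences and utility of risk for a fair even-odds bet."""
    outcomes = [math.log(WEALTH + stake), math.log(WEALTH - stake)]
    mean = sum(outcomes) / 2
    var = sum((u - mean) ** 2 for u in outcomes) / 2
    return mean, K * var


def overall_value(stake):
    mean, risk = mean_and_risk(stake)
    return mean + risk  # additive F, assumed for illustration; F need only be increasing


stakes = [10.0, 100.0, 500.0]
values = [overall_value(s) for s in stakes]
best = max(stakes, key=overall_value)
print(best)  # the smallest stake
```

Both terms fall as the stake rises, so any increasing F ranks the gambles in the same order as the additive one used here.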

Second, suppose a person confronts a sequence of gambles throughout a 
period when his wealth is increasing. All the gambles offer the same 
probability of losing the same sum of money, and offer no possibility of gain 
or other loss. As o ranges over the gambles, ΣPB(s_i) UB(CN[o, s_i]) (a
negative quantity) increases (decreases in absolute value) because the utility
curve for wealth is concave. Also, UB(R[o]) increases (decreases in absolute
value) because risks are less significant the better off one is. Hence, as the
person’s wealth increases, ΣPB(s_i) UB(CB[o, s_i]) (a negative quantity)
increases (decreases in absolute value). Therefore, if a person once decided 
to accept a 10% probability of losing $1,000 in order to avoid some 
unpleasant chore, and now is wealthier and faced with the same choice, then, 
other things being equal, consistency requires that he accept the gamble 
again. 
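
The second case can be illustrated in the same spirit. The sketch assumes a logarithmic utility curve for wealth, measures utilities as changes from the status quo, models the utility of risk as a negative multiple of the variance of those changes, and combines the two terms additively. The wealth levels, the constant K, and the utility assigned to the chore are invented.

```python
import math

LOSS = 1000.0    # the sum that may be lost
P_LOSS = 0.1     # probability of losing it
K = -1.0         # hypothetical weight on variance; negative for aversion to risk
CHORE = -0.02    # hypothetical utility of doing the unpleasant chore instead


def gamble_value(wealth):
    """Mean utility change plus utility of risk for the gamble at a given wealth."""
    change = math.log(wealth - LOSS) - math.log(wealth)  # utility change if the loss occurs
    mean = P_LOSS * change                               # negative quantity
    var = P_LOSS * (1 - P_LOSS) * change ** 2            # variance of the utility change
    return mean + K * var                                # additive F, assumed


def accepts_gamble(wealth):
    return gamble_value(wealth) > CHORE


wealth_levels = [5_000.0, 10_000.0, 50_000.0]
values = [gamble_value(w) for w in wealth_levels]
print([accepts_gamble(w) for w in wealth_levels])
```

Under these assumptions the value of the gamble is negative at every wealth level but rises toward zero as wealth grows, so once the gamble beats the chore it continues to do so at any higher wealth.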

Third, consider a sequence of gambles where the stakes and the 
probabilities of winning are constant, but where the weight of the evidence 
for the probabilities of winning increases. For vividness, we can suppose 
that (a) the gambles are coin tosses involving different coins, (b) each coin
toss pays $100 if heads comes up, (c) for each coin toss, as determined by a
statistical test of the coin, the probability of heads is 0.5, and (d) the sample
size of the statistical test increases from one coin toss to the next. Here as o
ranges over the gambles, ΣPB(s_i) UB(CN[o, s_i]) is constant, but UB(R[o])
increases (decreases in absolute value) because probabilities become firmer.
Hence as the weight of the evidence increases, ΣPB(s_i) UB(CB[o, s_i])
increases. Therefore, if one has to choose among such gambles, choosing the 
one for which the weight of evidence is greatest maximizes expected utility. 
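
The third case can also be sketched, though only with a stand-in for the weight of evidence. The example assumes that the disutility of risk is proportional to the standard error of the estimated probability of heads, which shrinks as the sample size grows. That modelling choice, the additive F, and all the figures are assumptions made for illustration and are not drawn from the text.

```python
import math

MEAN_UTILITY = 0.5  # hypothetical constant value of the mean utility of general consequences
K = -1.0            # hypothetical weight on the risk measure; negative for aversion to risk


def risk_utility(sample_size):
    """Assumed utility of risk: K times the standard error of an estimated 0.5 chance."""
    return K * math.sqrt(0.25 / sample_size)


def overall_value(sample_size):
    return MEAN_UTILITY + risk_utility(sample_size)  # additive F, assumed


sample_sizes = [10, 100, 1000]
values = [overall_value(n) for n in sample_sizes]
best = max(sample_sizes, key=overall_value)
print(best)  # the coin tested on the largest sample
```

Since the mean term is constant, the ranking is fixed entirely by the risk term, and any measure of risk that falls as evidence accumulates would give the same ordering.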

The foregoing shows how MB can be applied. Although the illustrations 
are more constrained than familiar applications of MN, they are just as 
realistic. Thus the charge that MB lacks practical interest is unwarranted.¹


6 SUMMARY


We used Allais’s and Ellsberg’s paradoxes and a new, general argument to
show that one version of MEU, MN, ignores attitudes toward risk, whereas 


1 For an application of mean-risk methods of evaluation to the St. Petersburg gamble, see
Weirich [1984]. 
another version of the rule, MB, takes them into account. MB accomplishes
this by counting the risk involved in an option among the consequences of 
the option. According to it, the utility for an option-state pair is the utility of 
all the results of the option given the state. We defended MB against some 
objections, in particular, the objection that it is inapplicable. 


The University of Rochester


1 INTRODUCTION


Recent trends in philosophy of science suggest that attempts are being made 
to establish some sort of compromise between the ‘philosophical’ 
perspective on the one hand and the ‘sociological’ perspective on the other. 
Examples of this are Newton-Smith [1981], which leans the former way, and
Hesse [1980], which leans the latter way. 

In the literature of a few years ago, it was usually assumed that, for 
instance, Lakatos and Laudan were firmly in the ‘philosophical’ camp and 
Bloor and Barnes were equally firmly in the ‘sociological’ camp. I have no 
particular quarrel with those assumptions. (Also the authors would have 
accepted them.) I am, however, much more doubtful about a further 
assumption; namely, that Kuhn was to be placed clearly in the ‘sociological’ 
area. 

With the publication of Barnes’ recent book [1982], I suspect that this 
interpretation is unlikely to change. However, since I hold the view that the 
interpretation is quite wrong, I feel a certain obligation to say so. This will 
involve looking again at the relevant texts and the (mostly contemporary) 
response to them. It is, I think, failure to take seriously Kuhn’s ‘second 


1 This distinction should become clearer later in the paper. Roughly, according to the 
philosophical position, it is the task of philosophers of science to find and specify generally 
acceptable criteria of rationality which would explain at any rate a very great deal of past 
science in ‘rational’ terms. According to the sociological position, science and, in particular, 
past science is to be analysed in socio-cultural terms. 


thoughts’, his reaction to his philosophical critics, which has led to the
present impasse.1


2 WHY IS KUHN’S ANALYSIS CONSIDERED SOCIOLOGICAL? 


The arguments for attributing a sociological position to Kuhn take many 
different forms and have varying degrees of support from the text. However, 
the following broad summary is, I think, adequate for present purposes: 


(a) Kuhn is drawing on the results of sociological research. 

(b) He is concerned with the significance of concepts and theories to the 
participants who were involved in historical events. 

(c) The notion of a ‘paradigm’ is such that it includes purely and overtly
social factors. (The prevalence of paradigm-governed ‘normal’
science is thus to be accounted for in sociological terms.)

(d) Different paradigms are ‘incommensurable’; hence the transition
from one to another can be accounted for only in social or 
psychological terms; in a sense, successive paradigms cannot be 
compared. 

(e) The whole notion of a ‘methodology’, in the philosopher’s sense, is
misconceived. Hence associated ideas: ‘rational reconstruction’,
‘internal history’, ‘rational history’ and so on are, to put it mildly,
suspect.

These points overlap to a considerable extent but I shall attempt to discuss
each in turn while avoiding too many tedious cross-references.


3 THE FIRST FOUR ARGUMENTS 
(a) The Role of Technical Sociology 


On the face of it, the claim that Kuhn is drawing on the results of the 
sociologists is an odd one. He is quite explicit that his work does not rely on 
received sociological or psychological theories. Rather, he says ([1970b],
p. 235), he relies on common-sense generalizations about the behaviour of 
scientific communities (not excluding observations collected by, among 
others, sociologists). What has perhaps misled some commentators is that 
Kuhn himself not infrequently uses the word ‘sociological’. 

The question arises as to how the scientific communities just mentioned 
are isolated; Kuhn cites certain content-free sociological papers in this 


1 There is a discussion of Kuhn’s later views in Musgrave [1971]. Examples of the sociological 
approach which largely omits such considerations are numerous. In fact, Musgrave seems to 
have written in vain. 

2 For instance [1970b], pp. 237-38. Kuhn links ‘sociological’ analysis with concentration on
scientific groups. (The argument is repeated in his [1977], p. xx). However, he describes the 
associated principles as ‘irreducibly sociological, at least at this time’. (My emphasis.) Thus he 
appears to imply that, in any case, the days of this approach may be numbered.

context which helps give the impression that he is using the results. In fact,
however, he cannot be: the groups isolated by these methods have a
membership which is to a high degree indeterminate and whose members,
insofar as they can be listed, can be said neither to have clearly specifiable
professional interests nor to share certain basic research-guiding features of
their work, what Kuhn calls a ‘paradigm’, in any straightforward sense.1

While it must be admitted that these results do not destroy Kuhn’s views, 
they do not provide much support either. It seems best, if it is not being too 
cynical, to regard the material as irrelevant.2 Musgrave is right when he
concludes ‘It seems that Kuhn’s . . . sociological citations should be taken 
with a pinch of salt’ ([1971], p. 288). The argument relating to these 
sociologists is a damp squib. 


(b) The Role of the Participants 


According to Barnes, who puts forward the case for regarding Kuhn as a 
sociologist, the basic argument is that Kuhn is explicitly concerned ‘to 
elucidate the significance of concepts and theories in the actors who
propounded them’. Barnes adds that Kuhn escapes some of the difficulties 
that usually confront idealist positions (which stress the actors’ frame of 
reference) by introducing such notions as ‘crisis’ in order to account for the 
modification of the actors’ attitudes. 
In order to support his fundamental point, Barnes quotes Kuhn: 


Rather than seeking the permanent contributions of an older science to our present
vantage, (historians) attempt to display the historical integrity of that science in its
own time. They ask, for example, not about the relation of Galileo’s views to those of
modern science, but rather about the relationship between his views and those of his
group, i.e., his teachers, contemporaries, and immediate successors in the sciences.3


This quotation is however somewhat out of context. Kuhn is there 
arguing against inductivism and almost anyone who is doing that will stress 
the inadequacy of relying on that which appears most relevant from the 
point of view of modern science. More generally, Kuhn can be interpreted as 
making a plea for greater emphasis to be placed on detailed historical 
arguments and interpretations rather than on global philosophies. However,
I simply do not see why, especially in view of his lack of reliance upon 
received theories and results, this turns Kuhn into a sociologist. It seems to 
me to be a necessary but not a sufficient condition. 


1 This emerges from an analysis of the material cited in Kuhn [1962], 176ff. 

2 The interesting question remains, however: how are the communities isolated? Presumably
we have to use ‘informal methods’ (what we already know of the history of science, 
independently of paradigms in order to avoid circularity) and make judgements accordingly. 
Such a procedure (unlike the sociological) will clearly not be content free. The latter is now, in 
any case, out of fashion. 

3 Barnes [1972]. The quotation is from Kuhn [1962], p. 3. Barnes says much the same thing in 
[1982], pp. 4-5. 


446 Keith Jones

It is true that Kuhn does apparently allow a wide-ranging group of
components for his disciplinary matrix: that which is shared by the
members of a community by way of beliefs, values and techniques.1 These
can presumably include purely and overtly social factors which would not 
normally be considered relevant to science. 

However, Kuhn does himself put forward what amounts to a criterion of
relevance: astrology, for instance, was not a science because astrologers ‘had
no puzzles to solve and therefore no science to practice’ ([1970a], p. 9). Now,
the puzzle-solving aspect of scientific activity occurs under the auspices of
the exemplars, the ‘concrete puzzle-solutions . . . employed as models or
examples’ which figure in scientific education and are regarded as
‘philosophically ... deeper’ than the disciplinary matrix with which they are
contrasted.2 Hence the special status of the exemplars diminishes the
importance of the other components of the disciplinary matrix. This is more 
evidence which casts considerable doubt on the conclusions reached by 
Barnes and others as a result of their interpretations of the text. 


(d) Incommensurability 


As far as incommensurability is concerned, Kuhn makes it clear that he 
subscribes only to a weak form of the thesis. He writes: 


What the participants in a communication breakdown can do is recognize each other 
as members of different language communities and then become translators . . . 
(This) translation . . . allows the participants . . . to experience vicariously something 
of the merits and defects of each other’s ‘points of view’ ([1962], p. 202). 


I do not think that this represents much of a threat to those, like Lakatos, 
who dismiss incommensurability arguments. 

There is, I think, in any case something extremely odd about the 
incommensurability thesis in that if it is true that the participants in the 
relevant scientific communities did not (or, according to the strong form of 
the thesis, could not) understand each other during a dispute about whether 
one paradigm should succeed another, how much more difficult is 
understanding going to be for the historian. Even a weak form of the 
argument would seem to render his task almost impossible, given additional 
difficulties of the lapse of time and so on. Yet this is a conclusion that would 
obviously be unacceptable to practising historians, including presumably 
Kuhn himself.3


1 Kuhn [1962], p. 175. He does use the term ‘sociological’ himself here. 

2 Kuhn [1962], p. 175. (The term ‘paradigm’ covers both disciplinary matrix and exemplar.) 
Barnes’ concentration on ‘exemplar’ notwithstanding. Cf. [1982], XIV.

3 A similar kind of point is made and discussed at length in Kitchener [1978].


4 THE FIFTH ARGUMENT: THE NOTION OF A METHODOLOGY


(a) What is a Methodology?


I take the idea of a ‘methodology’, in the philosopher’s sense, to be 
essentially a theory about scientific method which often, for instance, also 
serves to demarcate science from non-science or pseudo-science. The best 
known among the older methodologies are inductivism (scientific theories 
emerge from observable facts) and conventionalism (theories are 
conventions which are adopted by the scientist). It seemed likely that in the 
early 1960’s, these views having been largely discredited, a new methodology
would emerge based upon the work of Karl Popper. However, only one 
major historiographical work in this mould appeared and that was more 
negative than positive; that is, more opposed to the older methodologies 
than in favour of the new. The item in question was Agassi’s Towards an 
Historiography of Science [1963].1 His polemical work did not start a new 
tradition in historiography. It is only since 1968 or so that a new 
methodology has emerged, allegedly based on the work of Popper, which 
will bear the weight of historical examples. This is the ‘sophisticated 
methodological falsificationism’ of Lakatos which employs the notion of 
a ‘research programme’.2 For the purposes of this paper, this (by now
fairly ancient) development is the most important because it led to a
confrontation with Kuhn. 


(b) What do Methodologies Do? 


According to Lakatos ‘philosophy of science provides . . . methodologies 
in terms of which the historian reconstructs “internal history”’ ([1971],
p. 91). This ‘internal history’ or ‘rational history’ or ‘rationally
reconstructed history’ will differ with each methodology, or combination
of methodologies,3 and will include such things as Newton’s Laws,
Schrödinger’s Equation and Lavoisier’s Experiments and exclude such
things as the political situation, religion and economics. These latter 
subjects are relegated to ‘external history’. 

Now ‘internal history’, as Lakatos writes it, has some curious
characteristics. It need not be, and usually is not, true! Lakatos adopts the


1 L. Pearce Williams was influenced by Popper’s ideas when writing his biography of Faraday.
See his [1968].

2 Lakatos [1970]. To what extent this methodology is based on the work of Popper is a 
controversial matter. See Popper [1974]. 

3 A mixture of methodologies may be, and often is, used. In what follows, I shall assume that 
this is taken into account. In fact, Lakatos’s own approach ‘blends several different traditions’
as he says, [1970], p. 122.

curious device of ‘correcting’ it in footnotes. Also, and presumably as a
consequence, internal history is not ‘real’ history (unless one takes a 
‘Platonic’ view of what is real).2 It would be very odd, if it were, rather like
the opinion poll being right and the actual election result wrong. 

These characteristics may be taken as evidence for the sociological view 
that the whole notion is misconceived, that a satisfactory methodology has 
not been and perhaps cannot be found (and that an alternative approach is 
necessary). To the sociologist, attempts to make objective distinctions 
between rational and irrational beliefs and practices are of no interest.3


(c) Is Kuhn a Methodologist? 


Does Kuhn subscribe to this latter view? To be sure, he apparently rejects 
internal history. He is particularly bothered by the idea that someone who 
takes the notion seriously will have to include in the narrative what he knows 
to be false; no historian could bear to do that, he says ([1970b], p. 256, note
1). I am not sure whether this is a real problem or whether suitably modified 
typographical arrangements would dispose of it. However, the main 
question is whether Kuhn is proposing a methodology in the same sense as 
philosophers have done. Unfortunately the text points in different 
directions: on the one hand ‘I am no less concerned with rational 
reconstruction, with the discovery of essentials, than are philosophers of 
science’ and on the other ‘I began as an historian . . . examining closely the 
facts of scientific life’ ([1970b], p. 236). However, this is in the context of 
arguing against Lakatos. Looked at from a wider perspective, what stand out 
most clearly are not the differences but the similarities between him and 
Lakatos. Thus, Kuhn’s ‘paradigms’, ‘normal science’ and ‘crisis’ are very 
close indeed to Lakatos’ ‘hard core’, ‘work in the protective belt’ and 
‘degenerative phase’.4 The alleged differences between their positions rest
heavily on the attribution to Kuhn of certain sociological views of the type 
which I have already argued against. 

The only issue which seems to remain between them is Kuhn’s alleged 
exaggeration of the degree of consensus which usually occurs during a 
period of ‘normal’ research in a given scientific community. It has been 
suggested by Musgrave that this difficulty disappears if one allows 
competition between (possibly very small) groups hence allowing for the 
disagreement which Kuhn’s critics want to emphasize. Unfortunately it is 


1 For example, [1970], pp. 138, 53 respectively. He indicates where the ‘rational
reconstruction’ differs from ‘actual history’. 

2 Unfortunately Lakatos sometimes seems to take such a view, or one like it. See, for example,
[1971], p. 106 text to note 61; p. 18, text to note 61; p. 119, text to note 2, respectively. 

3 Barnes, [1972], p. 374, whom I cite as typical. Newton-Smith notes that Barnes et al. might as 
well try to bribe us into agreeing with them, [1981], p. 249. 

4 Kuhn himself has noted this: [1970b], p. 256.

not clear whether Kuhn himself would allow such competition. Moreover, in
his later writings Kuhn allows some disagreements between members of the
same community, which seems to leave things in a confused state. The
matter remains controversial.1

However, I am not at all sure that this is such a vital point. In practice, the 
degree of consensus prevailing will probably often be clear. Kuhn himself 
raises the case of the community of early nineteenth-century chemists 
([1962], p. 180). Their disputes over the existence of atoms led to such 
phenomena as Dalton’s refusal to accept Gay-Lussac’s law. This apparently 
serious breach of even a weak principle of consensus is perhaps accounted 
for by the unusual nature of Dalton’s theory: it did not predict; it was purely
explanatory and there was, as is well known, an element of dogmatism about
it. One took it or left it and got on with one’s research. In general, theories
proliferate when experiment plays a relatively minor part.2 For example, in
the late nineteenth and twentieth centuries there were many kinds of ether 
and J. J. Thomson pointed out that the same could be said of space.3 One
would not expect controversy with this type of science to cease even when 
the associated research is ‘normal’. My conclusion is that, if this kind of 
point is borne in mind, the issue loses a lot of its significance. Hence, if 
Lakatos is putting forward a methodology, as he surely is, then presumably 
Kuhn can be said to be doing so too. 

Can history written according to Kuhn’s ‘methodological’ proposals, 
however, really be described as ‘internal’ or ‘rational’? The presence, in the 
analysis, of any kinds of social factors would seem to suggest not and this, it 
might be argued, is the residual distinction between Kuhn and the 
philosophers which it is impossible to get round. 

I propose to argue, however, that at least certain kinds of social factors can 
legitimately be incorporated within the ‘internal’ area. But before doing this 
some terminological clarification is necessary. Let us use the term ‘rational
history’ to denote that which is reconstructed in accordance with a 
methodology of the type put forward by Lakatos and other philosophers of 
science. We can then employ ‘internal history’ to refer to the broader area 
which is not external in the traditional (albeit fuzzy) sense, where outside 
political interference, for example, would be external. [Whether one retains 
the term ‘methodology’ for that which guides the writing of internal history 


1 See Musgrave [1971], especially pp. 289-92. In a later article, Musgrave notes a second 
difference: ‘[Lakatos] provides objective criteria for appraising competing research
programmes ... Kuhn gives no such criteria, and once suggested, via the
“incommensurability thesis”, that none could be given’ ([1976], pp. 482-3). Insofar as this distinction
hinges on incommensurability, it is of course covered by earlier arguments. Insofar as it 
hinges on a detailed discussion of Lakatos’s methodology, it is beyond the scope of this 
paper; suffice it to say that his position is problematic. Ibid., especially Section 3. 

2 It is these areas of science upon which philosophers of science have often concentrated and in 
relation to which their criticisms of Kuhn appear most cogent. 

3 Thomson [1936], p. 432. He mentions expanding, contracting and vibrating universes.

is, I suspect, a matter of taste; I see no objection to retaining it provided the
sense is clear.] 

Now before going into details, it is worth noting that Kuhn explicitly 
endorses the view that science is a rational activity in spite of his reservations 
concerning older views of the place of rationality in scientific change.1 In
accordance with our revised terminology, the point would be better 
expressed by adding that factors can also be involved which are non-rational 
and yet internal.2

A case which illustrates my point here is the account given by Goldberg 
concerning the British response to Einstein’s Special Theory of Relativity 
[1970]. The neglect of Einstein’s theory and a concomitant insistence on 
retaining the concept of the ether dominated British physics at this time. 
Goldberg brings out a feature which he feels was ‘instrumental’ in 
understanding this phenomenon: the structure of British education in 
physics. ‘A British theoretical physicist . . . was trained to do ether 
mechanics; it is what he had to learn, and it is what he knew best’. This 
assertion is then backed up by quoting from examination papers in the 
Cambridge Mathematical Tripos. What we have here is a perfect example of 
the role of Kuhn’s exemplars. But what is even more important is that, if my 
earlier thesis about the special status of exemplars is right, this kind of factor 
can be incorporated internally while the inclusion of wider social factors 
remains questionable. Goldberg does no more than touch on these wider 
factors: ‘There are, no doubt, many features of British culture and society
that could aid in an understanding (of the phenomenon)’—but he wisely 
refrains from telling us what they are. Returning to the exemplars, I suspect 
that an account in terms of them will often be appropriate where national 
differences are invoked, hitherto regarded as territory firmly in the hands of 
the externalists. 

Of course, it is to some extent arbitrary what one calls ‘internal’ and 
‘external’ since such philosophical distinctions are notoriously never
clear-cut. However, I take it that no account which tells us what are (at least
allegedly) general features of scientific method, could hope to capture, say, 
printers’ errors, lost documents, religious intolerance and that such things 
are safely placed in the external area. On the other hand, such an account can 
capture the role of scientific training and that is what Kuhn has done. 
Becoming educated is internal to the professional activity, even if it is, in 
itself, non-rational. 


1 ‘A defense of irrationality in science seems to me not only absurd but vaguely obscene’
([1970b], p. 264). The issue arises in the context of the nature of the transition from one
paradigm to another. 

2 Distinctions between ‘rational’ and ‘internal’, although mentioned by Kuhn himself ([1971],
pp. 140-1) are often overlooked. See, for example, Newton-Smith [1977]. Curiously, Kuhn
makes little use of his own distinctions in his relatively recent [1980].


5 KUHN AND THE SOCIOLOGICAL PERSPECTIVES


If it is correct that Kuhnian history is best classified as ‘internal’, a 
convenient wedge could be driven between him and the sociologists should 
it turn out that their position is ‘external’. 

There appear to be two basic ways of putting the sociological perspective. 
Roughly: 


(a) Science has been influenced, in a causal manner, by non-scientific 
considerations. 

(b) Scientific beliefs are embedded in their historical context and should 
be elucidated in terms of their meanings to the actors who 
propounded them. 


The precise import of neither position seems to me to be very clear but at 
least (a) seems to fit into the external non-rational category. In the case of 
(b), I suspect, as often interpreted it is so weak as to be open to the objection 
that it is not really characteristically ‘sociological’ at all.* 

Hence, if worth its salt, the sociological approach to methodological 
questions is distinguishable from the Kuhnian—at least as distinguishable 
as the internal-external distinction upon which it rests. 


6 CONCLUSION 


What I have been trying to do in this paper is to correct what is by now a 
deeply entrenched view: to show that the sociological approach to 
methodological questions does not receive much support from Kuhn’s 
views and certainly nothing like as much support as sociologists are wont to 
claim. The real importance of Kuhn’s work lies in his drawing attention to 
certain universal features of scientific method—with striking originality. 


University of Kent at Canterbury


Discussions


A PROBLEM FOR RELATIVE INFORMATION MINIMIZERS, 
CONTINUED 


1. In a previous note (van Fraassen [1981]) one of us presented a problem 
for the relative information rule in probability kinematics. This sequel will 
be made self-contained by presenting both the rule (INFOMIN) and the 
problem (Judy Benjamin problem) again. We will argue that the problem 
does not admit of a unique solution. Comparing three specific such solutions 
by two further criteria (quasi-empirical tests of rule performance), we find 
that INFOMIN is not the best on either count. 


2. The problem of probability kinematics is this: given a prior probability 
function, and some constraint on what the posterior probabilities should be, 
how is one to choose that posterior? If the constraint is to assign 1 to a given 
proposition E, the standard answer (called by Ian Hacking the Bayesian 
Dynamic Assumption) is that given prior $P$, the posterior should be
$P'(\cdot) = P(\cdot \mid E) = P(\cdot \,\&\, E)/P(E)$, the rule of Simple Conditionalization. In
1965, Richard Jeffrey proposed a rule for the further sort of constraint, to
assign a number $0 < x < 1$ to $E$: transform $P$ into
$P'(\cdot) = x\,P(\cdot \mid E) + (1-x)\,P(\cdot \mid \bar{E})$, the rule now called Jeffrey Conditionalization. In general, a
rule R in probability kinematics applies to a certain set of constraints S(R) 
each of them pertaining to a suitable set of possible prior probability 
functions, and prescribes the posterior for each constraint in S(R), for each 
suitable prior. (In the two previous examples, P would not have been 
suitable if it had assigned zero to E.) 
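
The two rules just stated can be sketched for a finite set of possible worlds. This is only an illustration under stated assumptions: the four-world prior, the world labels and the function names are hypothetical and do not come from the text; a prior is represented as a dictionary from worlds to probabilities.

```python
from fractions import Fraction

def simple_conditionalize(prior, event):
    """Simple Conditionalization: P'(w) = P(w & E) / P(E)."""
    p_event = sum(p for w, p in prior.items() if w in event)
    if p_event == 0:
        # The text's "suitability" condition: the prior must not assign zero to E.
        raise ValueError("prior assigns zero to the conditioning event")
    return {w: (p / p_event if w in event else p * 0) for w, p in prior.items()}

def jeffrey_conditionalize(prior, event, x):
    """Jeffrey Conditionalization: P'(w) = x * P(w | E) + (1 - x) * P(w | not-E)."""
    complement = set(prior) - set(event)
    on_e = simple_conditionalize(prior, event)
    on_not_e = simple_conditionalize(prior, complement)
    return {w: x * on_e[w] + (1 - x) * on_not_e[w] for w in prior}

# Hypothetical four-world prior (an assumption made for illustration only).
prior = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 8), "d": Fraction(1, 8)}
E = {"a", "b"}

simple = simple_conditionalize(prior, E)
jeffrey = jeffrey_conditionalize(prior, E, Fraction(9, 10))
print(simple)
print(jeffrey)
```

In this sketch the Jeffrey update requires both E and its complement to have non-zero prior probability, which extends the suitability condition mentioned above in the obvious way.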


3. The rule we call INFOMIN is applicable to constraints (imposed singly 
or in combination) of the form $a_1 P(A_1) + \dots + a_n P(A_n) = r$, where
$A_1, \dots, A_n$ form a logical partition (expectation value constraints). It has been
presented sympathetically in various philosophical publications (e.g. 
Williams 1980, van Fraassen 1980). Deriving from a definition of 
information (negative entropy) in statistical mechanics, it is backed by a 
good deal of technical literature. This literature contains also a priori 
deductions of the rule from certain principles, which are presented as 
demands of consistency. (For recent examples see Shore and Johnson; 
Skilling; Tikochinsky et al.) 

In evaluating these justifications, however, one must carefully distinguish 
desiderata from more compelling considerations. One desideratum might be 
that the operations in probability kinematics commute, i.e., that the order in
which successive constraints are imposed makes no relevant difference.
This very nice feature characterizes Simple Conditionalization but neither 
Jeffrey Conditionalization nor INFOMIN. Another desideratum might be 
that the rule should apply to—or be extendable to a rule which applies to—a 
large class of constraints. But how large? The INFOMIN rule, for instance, 
does not apply to the constraint that $E$ and $E'$ should be stochastically
independent, i.e. $P'(E \,\&\, E') = P'(E)\,P'(E')$, and there is no reason to think
that it can be so extended. In some cases, unlike this one, we also think that it 
would be silly to expect to have a rule covering the constraint—e.g. the 
constraint that E should have a probability different from 0.7. In view of 
this, it seems to us that the question of justification should be broached for 
each class of constraints separately. The first class of constraints contained
exactly all those of form: give E posterior probability 1. The second class was
the enlargement to the form: give E posterior probability x. In these cases,
justifications of Simple and Jeffrey Conditionalization as unique solutions
have been presented [van Fraassen, 1986; Hughes and van Fraassen, 1985].
The third one is the still larger class of constraints of form: make the posterior 
conditional probability of E given F, equal to x. This is the class characterized 
by the Judy Benjamin problem, to be investigated here. We shall look into
(“symmetry,” “invariance,” “consistency”) requirements to narrow down
the admissible rules. But in accordance with the above, we shall not impose 
requirements concerning extendability to larger classes of constraints. 
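
The remark that Jeffrey Conditionalization does not commute can be checked on a small case. The sketch below is an illustration only: the correlated four-world prior and the target values 8/10 and 7/10 are hypothetical choices, not taken from the text.

```python
from fractions import Fraction

def jeffrey(prior, event, x):
    """Jeffrey Conditionalization: rescale E to total x and not-E to total 1 - x."""
    p_e = sum(p for w, p in prior.items() if w in event)
    return {
        w: (p * x / p_e if w in event else p * (1 - x) / (1 - p_e))
        for w, p in prior.items()
    }

def prob(dist, event):
    """Probability of an event (a set of worlds) under a distribution."""
    return sum(p for w, p in dist.items() if w in event)

# Hypothetical prior over four worlds in which E and F are correlated.
prior = {
    ("E", "F"): Fraction(4, 10),
    ("E", "not-F"): Fraction(1, 10),
    ("not-E", "F"): Fraction(1, 10),
    ("not-E", "not-F"): Fraction(4, 10),
}
E = {w for w in prior if w[0] == "E"}
F = {w for w in prior if w[1] == "F"}

# Impose "P'(E) = 8/10" and "P'(F) = 7/10" in the two possible orders.
e_then_f = jeffrey(jeffrey(prior, E, Fraction(8, 10)), F, Fraction(7, 10))
f_then_e = jeffrey(jeffrey(prior, F, Fraction(7, 10)), E, Fraction(8, 10))

print(prob(e_then_f, E), prob(f_then_e, E))
```

Only the constraint imposed last is guaranteed to hold in the final posterior, so the two orders end in different probability functions.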


4. In the recent movie Private Benjamin, Goldie Hawn (playing the 
character Judy Benjamin) enters the army and during war games, she and 
her patrol are dropped in a swampy area which they have to patrol. I shall 
now continue the story of their exploits there without straying too far from 
the movie. The war games area is divided into the region of the Blue Army, 
to which Judy Benjamin and her fellow soldiers belong, and that of the Red 
Army. Each of these regions is further divided into Headquarters Company 
Area and 2nd Company Area. The patrol has a map which none of them 
understands, and they are soon hopelessly lost. Using their radio they are at 
one point able to contact their own headquarters. After describing whatever 
they remember of their movements, they are told by the duty officer, “If you 
have strayed into Red Army territory, the probability is 3/4 that you are in 
their Headquarters Company Area”. At this point the radio gives out. The 
platoon goes on to capture enemy headquarters. 

The first relevant partition is the cross classification Red/Blue; 
Headquarters/Second. The obvious prior P to assume assigns each of the
resulting four areas a probability of one quarter. But this is not the coarsest 
relevant partition for the constraint, which does not pertain to subdivisions 
of the friendly Blue Army territory. The coarsest description is therefore: 


$A_1$ = Red Second Company Area
$A_2$ = Red Headquarters Company Area
$A_3$ = Blue Army Region

prior: $P(A_1) = 1/4 = P(A_2)$, $P(A_3) = 1/2$
posterior: $P'(A_2 \mid A_1 \text{ or } A_2) = 3/4$


Question: What will be Private Benjamin’s posterior probability that she is in 
the friendly Blue Army Region? 

That is equivalent to the question what her posterior probability function 
is; since we know how the Red Area is to be divided, we need only ask what 
the new proportion of Blue to Red will be. In audience surveys in classes and 
at lectures, the overwhelming preliminary reaction is that the probability of 
Blue should stay the same. As we shall see, neither INFOMIN, nor any 
other plausible rule agrees; and no acceptable rule could give that reply in 
the general case. Note that the general problem posed to Judy Benjamin is 
the simplest sort after those to which Simple and Jeffrey Conditionalization 
apply. 
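
To see numerically that INFOMIN does not leave the probability of Blue at one half, here is a rough sketch. It assumes that INFOMIN selects, among the posteriors satisfying the constraint, the one minimizing the relative information $\sum_i y_i \log(y_i/x_i)$; that formula is the usual definition and is not spelled out in this passage. The grid search and all function names are illustrative choices, not the authors' method.

```python
import math

def relative_information(y, x):
    """Relative information of posterior y with respect to prior x."""
    return sum(yi * math.log(yi / xi) for yi, xi in zip(y, x) if yi > 0)

def posterior_for(y1, r):
    """Posterior satisfying the constraint y2 = r * y1, parameterised by y1."""
    return (y1, r * y1, 1 - (1 + r) * y1)

def infomin_posterior(x, q, steps=100_000):
    """Grid search for the admissible posterior with least relative information."""
    r = q / (1 - q)
    upper = 1 / (1 + r)  # y1 must leave a non-negative share for Blue
    candidates = (posterior_for(upper * k / steps, r) for k in range(1, steps))
    return min(candidates, key=lambda y: relative_information(y, x))

prior = (0.25, 0.25, 0.5)  # Red 2nd Co., Red HQ Co., Blue
y = infomin_posterior(prior, q=0.75)
print(f"posterior probability of Blue: {y[2]:.4f}")
```

Under these assumptions the search settles on a probability for Blue of roughly 0.53, slightly above the prior value of one half.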


5. Let us first state the problem in general. The initial probabilities are 
$x = \langle x_1, x_2, x_3 \rangle$ and the posterior must be in the class
$C = \{ y = \langle y_1, y_2, y_3 \rangle : y_2/(y_1 + y_2) = q \}$, where $q$ is the new conditional probability. We can express
this equivalently in terms of odds, i.e. a ratio. The radio officer could have
said “You are three times as likely to be in Red HQ area as in Red 2nd
Company area”. Thus $C = \{ y : y_2 = r y_1 \}$ where the odds factor $r$ is related
to the conditional probability $q$ by: $r = q/(1-q)$; $q = r/(r+1)$.

It is as well to think about the problem in terms of odds rather than
probability. This means that the numbers x_i, y_i can be any non-negative
numbers. If x_1 + x_2 + x_3 = 1 we call x a special case of an odds vector, namely
a probability vector. Two odds vectors related by a constant of multiplication
are equivalent; we shall just write “=” for this. Thus x = <1, x_2/x_1, x_3/x_1>
= <1, s, t>, which expresses x in, we shall say, canonical form; then y = <1, r,
t'> and the problem is to determine t' in terms of the other factors. This
general problem has a number of “symmetries” which must be respected.
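
The conversions just described can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and exact fractions are used so that the canonical form <1, s, t> of the Judy Benjamin prior can be read off without rounding.

```python
from fractions import Fraction

def odds_from_prob(q):
    """Odds factor r = q/(1 - q) for a conditional probability q < 1."""
    return q / (1 - q)

def prob_from_odds(r):
    """Inverse conversion q = r/(r + 1)."""
    return r / (r + 1)

def canonical(x):
    """Rescale an odds vector so that its first component is 1: <1, s, t>."""
    x1, x2, x3 = x
    return (x1 / x1, x2 / x1, x3 / x1)

# Judy Benjamin's prior as a probability vector <x_1, x_2, x_3>.
prior = (Fraction(1, 4), Fraction(1, 4), Fraction(1, 2))
canonical_prior = canonical(prior)       # <1, s, t> with s = 1, t = 2
r = odds_from_prob(Fraction(3, 4))       # constraint q = 3/4 gives odds r = 3
```

Multiplying every component of an odds vector by the same positive constant leaves `canonical` unchanged, which is the sense of equivalence used in the text.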


6. Peter Williams pointed out the reason why the rule ‘‘keep the probability 
of Blue the same” cannot be correct in general (see postscript, [van Fraassen, 
1981]). For if q = 1 instead of 3/4, then the radio officer’s announcement is 
equivalent to “You are certainly not in Red 2nd Co. area.” This case falls 
under the rule of Simple Conditionalization, and yields the result that Blue 
gets posterior probability 2/3. (For we delete one of the four initially equally 
probable areas; the remaining three are still equi-probable.) Similarly if 
q = 0. Let us make that our first principle.


I. If q = 1, the prior x is transformed by Simple Conditionalization on
(A_2 or A_3); if q = 0 by Simple Conditionalization on (A_1 or A_3).


This means that if q = 1, we have the posterior y = <0, x_2, x_3> and if q = 0
then y = <x_1, 0, x_3>.
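
As a small check of principle I, the following sketch applies Simple Conditionalization to the example prior. The helper `conditionalize` and its 0-based cell indices are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction

def conditionalize(x, keep):
    """Simple Conditionalization: zero the cells outside `keep`, renormalize."""
    kept = [p if i in keep else Fraction(0) for i, p in enumerate(x)]
    total = sum(kept)
    return tuple(p / total for p in kept)

# Cells are indexed 0, 1, 2 for A_1, A_2, A_3.
prior = (Fraction(1, 4), Fraction(1, 4), Fraction(1, 2))
post_q1 = conditionalize(prior, keep={1, 2})   # q = 1: condition on (A_2 or A_3)
post_q0 = conditionalize(prior, keep={0, 2})   # q = 0: condition on (A_1 or A_3)
```

In both limiting cases the probability of Blue rises from 1/2 to 2/3, as stated for q = 1 in the text.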


II. If q equals the prior conditional probability (of HQ given Red), i.e.
q = q_0 = x_2/(x_1 + x_2), then all probabilities stay the same (posterior
= prior).


This second principle would have applied if the radio officer had said “You
are equally likely to be in HQ Co. area as in 2nd Co. area, if you are in Red
territory”, exactly what Judy would have said to herself before his message,
which is therefore of no help at all.


7. Although I already eliminates the first candidate for a rule, I and II
together do not suffice to eliminate the following proposal: let the probability
of Blue change in direct proportion to the change in q. Call this the
proportionality rule.

In our example the rule would act as follows. The prior is x = <1/4, 1/4,
1/2>, which is also the posterior y if q = q_0 = 1/2. That posterior is y = <0,
1/3, 2/3> if q = 1. As the increase in q varies from 0 to 1/2, the increase in x_3
varies from 0 to 1/6 therefore. For a case in between, say q = 3/4 = q_0 + 1/4,
let the increase in x_3 be proportional: half of the maximum increase. Thus
the answer, according to this rule, is: if q = 3/4 then the probability of Blue
becomes (1/2) + (1/12) = (7/12).

Stated in general for a probability vector x the rule is this: when q > q_0, y_3
= x_3 + k(q - q_0)/(1 - q_0), where x_3 + k is the value of y_3 when q = 1. (That
value is x_3/(x_2 + x_3).) Similarly if q < q_0, y_3 = x_3 + m(q_0 - q)/q_0 where x_3
+ m is the value of y_3 when q = 0 (i.e. x_3 + m = x_3/(x_1 + x_3)). It is in some
ways rather a nice rule, for it expresses y_3 as a continuous function of x_1, x_2,
x_3, and q.
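
The general statement of the proportionality rule can be written out as a short function. This is a sketch under the assumption that the prior conditional probability lies strictly between 0 and 1; the name `proportionality_rule` is hypothetical, and the Red mass left over is split between the two Red areas in the ratio (1 - q) : q, as the constraint requires.

```python
from fractions import Fraction

def proportionality_rule(x, q):
    """Posterior probabilities under the proportionality rule.

    x is a prior probability vector <x_1, x_2, x_3>; q is the new value of
    P'(A_2 | A_1 or A_2). Assumes 0 < x_2/(x_1 + x_2) < 1.
    """
    x1, x2, x3 = x
    q0 = x2 / (x1 + x2)                  # prior conditional probability
    if q >= q0:
        k = x3 / (x2 + x3) - x3          # increase in Blue when q = 1
        y3 = x3 + k * (q - q0) / (1 - q0)
    else:
        m = x3 / (x1 + x3) - x3          # increase in Blue when q = 0
        y3 = x3 + m * (q0 - q) / q0
    red = 1 - y3                         # mass left for the two Red areas
    return (red * (1 - q), red * q, y3)

prior = (Fraction(1, 4), Fraction(1, 4), Fraction(1, 2))
posterior = proportionality_rule(prior, Fraction(3, 4))
```

For the example prior and q = 3/4 this returns a Blue probability of 7/12, and the full posterior is odds-equivalent to <5, 15, 28>.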

We consider this rule unacceptable, though for a reason which takes a 
little time to present. The general form of any such rule, if we state it in 
terms of the odds rather than the probabilities, is 


                 R(q)
    <1, s, t> --------> <1, r, g(s, r, t)>


where r = q/(1 - q), the posterior odds of A_2 to A_1. What can we say about
this function g, which it is the job of the rule to specify?

Imagine that for q = 3/4 and for the three different prior odds vectors
<1, 1, 1>, <1, 1, 2>, and <1, 1, 3> a certain rule gives posteriors <1, 3, 10>,
<1, 3, 5> and <1, 3, 15>. This does not look right; although the prior probability
of Blue (the sole difference between these cases) is expected to play
some role in determining the posterior, it should do so in a “law like”
manner. This remark expresses only a desideratum; while we shall now explicate
it in the form of a requirement of symmetry (or invariance) we do
not pretend that it is a requirement of logical consistency.

The most important relation between odds vectors is equivalence; this
relationship is preserved by transformations of the form


    Ux = <u_1 x_1, u_2 x_2, u_3 x_3>,   u_i > 0; i = 1, ..., N.


That is, if x = y then Ux = Uy. Let us call these transformations, which are
of course familiar ones from linear algebra, uniform. They also preserve
simple odds comparisons such as are expressed by a statement like “My
odds for A_1 to A_3 are twice yours” [Hughes and van Fraassen, 1985]. In the
problem at hand, the posterior must be a member of the constraint set C(q)
= {y : y_2/(y_1 + y_2) = q}. That set is invariant exactly under those uniform
transformations U such that u_1 = u_2. These transformations are therefore
symmetries of our problem; they leave intact everything that we consider
significant in the statement of the problem. We must require therefore, if
essentially similar problems are to receive essentially similar solutions, that
the following diagram have equivalent odds vectors at each of its
vertices:





                       R(q)
    <1, s, t>       ----------->  <1, r, g(s, r, t)>
        |                             |
        | U                           | U
        v                             v
                                  <u, ur, wg(s, r, t)>
                       R(q)
    <1, s, (w/u)t>  ----------->  <1, r, g(s, r, (w/u)t)>


This means really that for any positive number k, we must have g(s, r, kt)
= kg(s, r, t). And that means in turn that g must have the special form
g(s, r, t) = tχ(s, r). (Proof: Define χ(s, r) = g(s, r, 1).) Now in this odds
formulation, t was x_3/x_1 and t' was y_3/y_1; so we have found the principle:


III. The ratio x_3/x_1 should change (to y_3/y_1) by a factor χ(s, r) which is a
function only of the initial odds s = x_2/x_1 and the constrained odds r
= y_2/y_1.


Looking back to our proposed “proportionality” rule, we find that it violates
III. For example with s = 1, r = 3 (i.e. q = 3/4), we have the following
cases:


Prior        Posterior        Factor
1, 1, 1      3, 9, 10         (10/3) = 3.333
1, 1, 2      5, 15, 28        (28/10) = 2.8
1, 1, 3      11, 33, 116      (116/33) = 3.51


which clearly shows that the factor by which the odds of Blue to Red 2nd Co.
change is not a function of s and r alone.


8. We are now beginning to get a good picture of what the rules should be
like. They should take the form


                 R(q)
    <1, s, t> --------> <1, r, tχ(s, r)>


Principle II becomes: if q = q_0, i.e. if r = s, then χ(s, r) = 1. Principle I
yields: if q = 0, i.e. if r = 0, then y_3/y_1 = x_3/x_1, so again χ(s, r) = 1. Well,
what about this as a third suggestion: χ(s, r) is the constant function equal to
1. This is the rule: always keep the odds of Blue to Red 2nd Co. the same. Since
this proposal treats the two Red areas quite differently, it should come as no
surprise that this “constant factor” rule also violates a symmetry
requirement.

Imagine the radio officer had said the exact reverse of what he did say,
that is, “If you have strayed into Red Army territory, the probability is 3/4
that you are not in the HQ Co. Area but in 2nd Co. Area.” What difference
would that make? Really none; it is the same problem as before; it is merely
as if the HQ and 2nd Co. Areas in Red Territory have been relabelled or
interchanged. The new probability for Blue should be no different.

Generalizing this, we say that relabelling A_1 and A_2 as A_2 and A_1 should
make no difference to the posterior probability for A_3, provided of course
we really rewrite the relabelled problem consistently. Posterior odds of r/1
for A_2 to A_1 become posterior odds 1/r; also prior odds relabelled change
from s/1 to 1/s. The odds of A_3 to A_1 and A_2 must also be changed
accordingly: the prior probability of A_3 must now bear to A_1 the relation it
used to bear to A_2. Thus


                       R(q)
    <1, s, t>       ----------->  <1, r, tχ(s, r)>

                     R(1 - q)
    <1, 1/s, t/s>   ----------->  <1, 1/r, (t/s)χ(1/s, 1/r)>


are really the same problem, and so must have the same solution. That
means the posterior probability for Blue should be the same in both cases:


    tχ(s, r)/(1 + r + tχ(s, r)) = (t/s)χ(1/s, 1/r)/(1 + (1/r) + (t/s)χ(1/s, 1/r))


That looks complicated, but reduces quickly to:

IV. The function χ described in III is such that
χ(1/s, 1/r) = (s/r)χ(s, r).


Our third candidate for a rule, which set χ(s, r) equal to 1 everywhere, clearly
violates this requirement. For if χ(s, r) = 1 then χ(1/s, 1/r) = (s/r) according
to IV, and that differs from 1 unless r = s.
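
The reduction to IV can be spelled out as follows. This is a reconstruction of the omitted algebra, writing χ for the factor function of III and using the abbreviations a and b (introduced here only for brevity) for the two posterior Blue odds.

```latex
\begin{align*}
\frac{a}{1 + r + a} &= \frac{b}{1 + \tfrac{1}{r} + b},
  \qquad a = t\,\chi(s, r), \quad b = \tfrac{t}{s}\,\chi(1/s, 1/r) \\
a\left(1 + \tfrac{1}{r} + b\right) &= b\,(1 + r + a) \\
a\,\frac{1 + r}{r} &= b\,(1 + r) \\
a &= r\,b \\
t\,\chi(s, r) &= \frac{r\,t}{s}\,\chi(1/s, 1/r)
  \;\Longrightarrow\; \chi(1/s, 1/r) = \frac{s}{r}\,\chi(s, r).
\end{align*}
```

The cross term ab cancels in the second step, which is why the prior odds t drop out of the final condition.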





9. Looking back over this, we see that almost all the requirements have been
either stated or restated in terms of odds, namely:


I. If r = 0 then χ(s, r) = 1
II. If r = s then χ(s, r) = 1
IV. χ(1/s, 1/r) = (s/r)χ(s, r)


III justifies the notation χ(s, r) in the description of the rule as transforming
<1, s, t> into <1, r, tχ(s, r)>. But one part of I has not yet been restated, the part
pertaining to q = 1, which makes r = q/(1 - q) infinite. We cannot usefully
talk about infinity as a ratio. If the function χ(s, r) is continuous, we can
reason that the case in question is the limit for large values of r. From I we
know then that y_3/y_2 approaches x_3/x_2, i.e. tχ(s, r)/r approaches equality
with t/s. Thus χ(s, r) approaches equality with (r/s) as r increases to infinity.


V. If χ is a continuous function of r and s then limit [χ(s, r) - (r/s)] = 0
as r → ∞.


Our fourth prima facie candidate for a rule which suggests itself is therefore:
χ(s, r) = (r/s). But this violates I for r = 0.


You can’t squeeze blood from a stone. We have come to the end of the 
symmetries of the general class of Judy Benjamin problems. Let us now look 
at some rules that do satisfy all five principles. We shall examine three, each 
with a rationale. 


10. The INFOMIN rule: choose the posterior y such that the relative
information I(y; x) = Σ y_i log (y_i/x_i) is minimized. (Here y, x are probability
vectors, i.e. their components sum to 1.) In the previous discussion note [van
Fraassen, 1981] it was shown that this leads to the solution:


    <1, s, t> is transformed into <1, q/(1 - q), t(q/(s(1 - q)))^q>.


An equivalent way to state this, using r = q/(1 - q), is to say that χ(s, r) =
(r/s)^q. For the initial Judy Benjamin example, with prior odds <1, 1, 2>
and q = 3/4 so r = 3 and s = 1, we have the posterior odds <1, 3, 2(3)^(3/4)> =
<1, 3, 4.55> which gives a posterior probability for Blue of 0.532
approximately.

This rule clearly satisfies all five requirements. The relative information
function I(y; x) is always non-negative, equalling zero if and only if x = y; it
is not symmetric in x and y.
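
A numerical cross-check of the closed form is easy to sketch. The code below is an illustration only: it compares the quoted formula with a brute-force search over the one free parameter of the constraint set (the Blue probability). The function names and the grid resolution are assumptions made for the example.

```python
import math

def relative_information(y, x):
    """I(y; x) = sum_i y_i log(y_i / x_i), with 0 log 0 taken as 0."""
    return sum(yi * math.log(yi / xi) for yi, xi in zip(y, x) if yi > 0)

def infomin_posterior(x, q):
    """Closed form quoted above: <1, s, t> -> <1, r, t (r/s)^q>, normalized."""
    x1, x2, x3 = x
    s, t = x2 / x1, x3 / x1
    r = q / (1 - q)
    odds = (1.0, r, t * (r / s) ** q)
    total = sum(odds)
    return tuple(o / total for o in odds)

def constrained(y3, q):
    """Probability vector with y_2/(y_1 + y_2) = q and Blue probability y3."""
    return ((1 - y3) * (1 - q), (1 - y3) * q, y3)

prior = (0.25, 0.25, 0.5)
q = 0.75
posterior = infomin_posterior(prior, q)

# Brute-force search over the Blue probability on a grid of step 0.0001.
grid = [i / 10000 for i in range(1, 10000)]
best_y3 = min(grid, key=lambda y3: relative_information(constrained(y3, q), prior))
```

Both routes give a Blue probability of about 0.533, which the text reports as 0.532 approximately.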


11. The MTP rule [Hughes and van Fraassen, 1985]. This rule was
constructed in analogy with the so-called Projection Postulate in quantum
mechanics. The probability functions on A_1, A_2, A_3 are represented by the
points on the unit sphere: x = <x_1, x_2, x_3> with x_1^2 + x_2^2 + x_3^2 = 1. Thus
the probabilities are not the components of the vector, but their squares.
The measure of closeness or nearness of two vectors x and y is their scalar
product (y . x) = y_1 x_1 + y_2 x_2 + y_3 x_3 or its square. The MTP rule says: choose
the posterior such that this nearness between prior and posterior is maximized.
In quantum mechanics, the square of the scalar product (x . y) is used to
represent the transition probability between states represented by x and y;
hence the name of the rule, Maximum Transition Probability. In our case, in
which the components of the vectors are square roots of probabilities, which
we can take as non-negative by convention, it makes no difference to the rule
whether we use (y . x) or its square as nearness measure. The function takes
its maximum value 1 if and only if x = y; its minimum value zero exactly if x
and y are orthogonal (i.e. for each i = 1, 2, 3 either x_i or y_i equals zero); and is
symmetric in x and y.
In the generalized Judy Benjamin example, the posterior probabilities for
A_1, A_2, A_3 can be written in the form <a^2(1 - q), a^2 q, 1 - a^2>, which receives as
its representation the vector y = <a√(1 - q), a√q, √(1 - a^2)>, where we can take
a to be positive. To maximize the scalar product (y . x) = x_1 a√(1 - q) +
x_2 a√q + x_3 √(1 - a^2) we set its derivative with respect to a equal to zero, and
find the solution for a:


    a^2 = (x_1√(1 - q) + x_2√q)^2 / (x_3^2 + (x_1√(1 - q) + x_2√q)^2).


It is easy enough to express this solution in odds form (thus returning to our
earlier representation of states of opinion): write K = (x_1√(1 - q) + x_2√q)^2
and let p_i = x_i^2, the prior probabilities. Then a^2 = K/(p_3 + K). The posterior
probabilities which we wrote above as a^2(1 - q), a^2 q, 1 - a^2 are therefore
odds equivalent to 1, q/(1 - q), (1 - a^2)/(a^2(1 - q)), which is quickly seen to be
<1, r, p_3/(K(1 - q))>. If we now write, as we did before, t = p_3/p_1 and t'
= tχ(s, r) as the posterior odds of A_3 (Blue) to A_1, then χ(s, r) = p_1/(K(1 - q)).
That expression too can be considerably simplified, to yield the elegant
reformulation:


    prior odds <1, s, t> transforms into <1, r, t((1 + r)/(1 + √(rs)))^2>.

Again the solution satisfies all five requirements. 

The initial example, with prior odds <1, 1, 2>, s = 1, and r = 3, now has
posterior odds <1, 3, 2(4/(1 + √3))^2> = <1, 3, 4.28> which yields a posterior
probability for Blue of 0.517 approximately.
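
The simplification of the MTP factor is not written out in the text; one way to carry it out is sketched below. It assumes, as in the odds formulation, that s = p_2/p_1 and r = q/(1 - q), so that 1 - q = 1/(1 + r); χ denotes the factor function of III.

```latex
\begin{align*}
K &= \left(\sqrt{p_1}\sqrt{1-q} + \sqrt{p_2}\sqrt{q}\right)^2
   = p_1 (1-q)\left(1 + \sqrt{\tfrac{p_2}{p_1}}\sqrt{\tfrac{q}{1-q}}\right)^2
   = p_1 (1-q)\left(1 + \sqrt{rs}\right)^2, \\[4pt]
\chi(s, r) &= \frac{p_1}{K(1-q)}
   = \frac{1}{(1-q)^2\left(1 + \sqrt{rs}\right)^2}
   = \left(\frac{1 + r}{1 + \sqrt{rs}}\right)^2 .
\end{align*}
```

With s = 1 and r = 3 the factor is (4/(1 + √3))^2, approximately 2.14, which gives the posterior odds <1, 3, 4.28> of the worked example.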


12. The MUD rule. It is easiest to begin with the formal definition: χ(s, r)
equals 1 if r < s and equals (r/s) otherwise, for this is intuitively the simplest
solution that fits all five of our principles. It is suggested directly by I, II, and
V: we set χ(s, r) = r/s for all r > s to satisfy V, and set χ(s, s) = 1 to satisfy II,
and χ(s, 0) = 1 to satisfy I. Then we assign 1 to χ(s, r) for all remaining values
of r, those between zero and s. This satisfies IV along the way, for we
have r > s if and only if (1/r) < (1/s). Hence if r > s, IV is satisfied because
χ(s, r) = (r/s) = (r/s)χ(1/s, 1/r). If r < s then (1/r) > (1/s) so χ(1/s, 1/r)
= (1/r)/(1/s) = (s/r) = (s/r)χ(s, r).

The name derives from its rationale. Think of the relevant propositions 
represented by a Venn diagram, with mud heaped on it. The proportion of 
mud on each area represents the probability of the corresponding 
proposition. Now the MUD rule says: to satisfy a given constraint on the 
posterior, remove as little mud as you need to, without redistributing the 
remaining mud. In the case of Simple Conditionalization on proposition E,
this is clearly achieved by only removing the mud on not-E, and leaving the
remainder untouched. In Judy Benjamin’s case, when the announcement is
made that the odds of Red HQ to Red 2nd Co. are r : 1 we look first to see if
the prior odds s : 1 are greater or smaller. If they are greater, we remove mud
only from Red HQ, till the correct proportion is reached. (Any more
complicated procedure, with removal of mud from the other areas, will
increase the total mud removed. For if we take away mud from Blue, we have
done nothing to change the odds in question, and if we remove mud from
Red 2nd Co., we have made the initial odds, which were too large, still
larger.) If the initial odds were smaller, we remove mud from Red 2nd Co.
only. Note that if s was too large, our procedure leaves the ratio of the
probability of Blue to that of Red 2nd Co. (i.e. of A_3 to A_1) the same, which
means that the factor χ equals 1.

The initial example with prior odds <1, 1, 2>, s = 1, r = 3 is thus
transformed into <1, 3, 2(3)> = <1, 3, 6> with a posterior probability for
Blue of 0.6.
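
The three rules can be put side by side as factor functions and spot-checked against principles I, II and IV. The sketch below is illustrative: the function names are hypothetical, the MTP factor is taken in its simplified form ((1 + r)/(1 + √(rs)))^2, and the check is made at a single arbitrary sample point rather than proved in general.

```python
import math

def chi_infomin(s, r):
    """INFOMIN factor (r/s)^q with q = r/(r + 1)."""
    return (r / s) ** (r / (r + 1))

def chi_mtp(s, r):
    """MTP factor ((1 + r)/(1 + sqrt(rs)))^2."""
    return ((1 + r) / (1 + math.sqrt(r * s))) ** 2

def chi_mud(s, r):
    """MUD factor: r/s if r > s, otherwise 1."""
    return r / s if r > s else 1.0

RULES = {"INFOMIN": chi_infomin, "MTP": chi_mtp, "MUD": chi_mud}

def blue_posterior(chi, s, t, r):
    """Posterior probability of Blue for prior odds <1, s, t> and new odds r."""
    t_new = t * chi(s, r)
    return t_new / (1 + r + t_new)

def satisfies_principles(chi, s=0.7, r=2.5, tol=1e-9):
    """Spot-check I (r = 0), II (r = s) and IV at one sample point."""
    first = abs(chi(s, 0.0) - 1) < tol
    second = abs(chi(s, s) - 1) < tol
    fourth = abs(chi(1 / s, 1 / r) - (s / r) * chi(s, r)) < tol
    return first and second and fourth

# Judy Benjamin's case: prior odds <1, 1, 2>, q = 3/4, hence r = 3.
blue = {name: blue_posterior(chi, s=1, t=2, r=3) for name, chi in RULES.items()}
checks = {name: satisfies_principles(chi) for name, chi in RULES.items()}
```

For the example the Blue probabilities come out near 0.533 (INFOMIN), 0.517 (MTP) and exactly 0.6 (MUD), in line with the figures given in the three sections.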


13. We now have three rules, each with some intelligible rationale, and all 
satisfying the five requirements. It is time therefore to think of how such 
rules might be evaluated as better or worse by general criteria for what such 
rules are meant to do. A person employing such a rule has a method for
changing his opinion systematically, when he accepts new constraints on it 
in response to his experience. There may be various such criteria; we have 
thought of two. The first is that it must be possible to learn from experience, 
and fairly efficiently, in those cases in which the constraints imposed reflect 
an objectively correct probability distribution. The second is that in a case in 
which the data are confusing, so that the subject has occasion to change his 
mind a number of times, the consequences on his state of opinion should not 
be too drastic. 


14. Learning rate. Let us suppose that A_1, A_2, A_3 pertain to three horses
who will race against each other on some future date. The objective chances
of the first, second, and third horse winning are a, b, c. Our subject sees
practice races between these horses two at a time, and in each case imposes
on his posterior the correct constraint on the probability that the i-th horse
will win, on the supposition that the i-th or j-th horse will win. Thus at each
practice race, he is in a Judy Benjamin situation. At the first, supposing his
prior odds are 1, 1, 2 and he sees the first and second horse race, he imposes
the constraint that P'(A_2 | A_1 or A_2) = b/(a + b) = q. Then he witnesses the
second practice race and imposes the constraint P''(A_3 | A_1 or A_3) = c/(a + c),
starting from his new prior P' of course. And so on, for as long as you like.
How long does it take before his probabilities have come to within say 0.1 or
0.0001 of a, b, c? It is easy to write a program for this on a personal computer;
some representative results are shown below.

The numbers under the names of the rules indicate the number of steps 
required in this process to reach the listed degree of accuracy. Inspection of 
other examples and greater degrees of precision have not upset the 
impression gained from this list, namely that INFOMIN and MTP do 
about equally well, while MUD does much better.


                          within 0.1          within 0.0001
Prior      Target       INF  MTP  MUD       INF  MTP  MUD

1,1,1      1,10,10        3    3    3         3    3    3
1,1,1      1,10,100       3    3    3         6    6    3
3,3,4      1,2,7          6    6    3         9    9    3
7,2,1      1,2,7          6    6    3        11   12    3
5,6,7      7,8,9          2    2    1         9    9    3
1,5,25     1,2,4          6    6    2        12   12    2
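
A program of the kind mentioned can be sketched as follows. Several details are assumptions rather than facts from the text: the practice races are taken in the repeating order (1, 2), (1, 3), (2, 3); a step is one imposed constraint; and the process stops when every probability is within the tolerance of its target. Under other conventions the counts need not agree with those tabulated.

```python
import math

def chi_infomin(s, r):
    return (r / s) ** (r / (r + 1))

def chi_mtp(s, r):
    return ((1 + r) / (1 + math.sqrt(r * s))) ** 2

def chi_mud(s, r):
    return r / s if r > s else 1.0

def update(x, i, j, r, chi):
    """Impose posterior odds r for A_j to A_i on the vector x.

    The remaining cell k is rescaled relative to A_i by chi(s, r), as in
    <1, s, t> -> <1, r, t chi(s, r)>; the result is normalized to sum to 1.
    """
    k = 3 - i - j
    s, t = x[j] / x[i], x[k] / x[i]
    y = [0.0, 0.0, 0.0]
    y[i], y[j], y[k] = 1.0, r, t * chi(s, r)
    total = sum(y)
    return [v / total for v in y]

def steps_to_learn(prior, target, chi, tol, max_steps=1000):
    """Number of practice races until every probability is within tol."""
    goal = [v / sum(target) for v in target]
    x = [v / sum(prior) for v in prior]
    races = [(0, 1), (0, 2), (1, 2)]      # assumed order of practice races
    for step in range(max_steps + 1):
        if all(abs(a - b) <= tol for a, b in zip(x, goal)):
            return step
        i, j = races[step % 3]
        x = update(x, i, j, target[j] / target[i], chi)
    return None                            # no convergence within max_steps

mud_steps = steps_to_learn((3, 3, 4), (1, 2, 7), chi_mud, tol=1e-4)
```

Passing `chi_infomin` or `chi_mtp` instead of `chi_mud` runs the same experiment for the other two rules. With MUD the target is reached exactly after one full cycle of three races in this example.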


15. Wavering effect. Suppose Judy Benjamin had accepted constraint 
q = 3/4, then because of new data the different constraint q = 1/4 (starting 
from her new prior), then q = 3/4 again, and so forth. Each time, on any of 
our three rules, her probability for Blue goes up. How soon does it reach 
0.99? We would prefer the rate of increase in the probability of Blue not to be 
too fast as she wavers, for a while, between high and low values q_1 and q_2 for
the conditional probability of HQ, given Red. Again it is easy enough to 
program this situation; here are some results from our three rules. 

Prior      q_1    q_2      INF  MTP  MUD

1,1,2      2/3    1/3       22   41    9
1,1,2      3/4    1/4       11   18    6
1,2,3      2/3    1/3       22   42    9
1,2,3      3/4    1/4       11   18    7
2,1,3      2/3    1/3       21   41    8
2,1,3      3/4    1/4       10   18    6
5,6,7      2/3    1/3       24   45    9

The disparity is much greater between the rules in this test, with MTP 
faring by far the best, and INF significantly better than MUD. 
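
The wavering test can be sketched in the same style. The counting convention here is an assumption: a step is one imposed constraint, starting with the high value, and the count is the number of constraints imposed when the probability of Blue first reaches 0.99.

```python
import math

def chi_infomin(s, r):
    return (r / s) ** (r / (r + 1))

def chi_mtp(s, r):
    return ((1 + r) / (1 + math.sqrt(r * s))) ** 2

def chi_mud(s, r):
    return r / s if r > s else 1.0

RULES = {"INFOMIN": chi_infomin, "MTP": chi_mtp, "MUD": chi_mud}

def wavering_steps(prior, q_high, q_low, chi, threshold=0.99, max_steps=10000):
    """Count constraint updates until the probability of Blue reaches threshold."""
    x1, x2, x3 = prior
    s, t = x2 / x1, x3 / x1               # canonical form <1, s, t>
    for step in range(1, max_steps + 1):
        q = q_high if step % 2 == 1 else q_low
        r = q / (1 - q)
        t, s = t * chi(s, r), r           # <1, s, t> -> <1, r, t chi(s, r)>
        if t / (1 + s + t) >= threshold:
            return step
    return None

counts = {name: wavering_steps((1, 1, 2), 0.75, 0.25, chi)
          for name, chi in RULES.items()}
```

For the prior 1, 1, 2 with q alternating between 3/4 and 1/4 this convention gives 10, 17 and 5 updates for INFOMIN, MTP and MUD, one fewer in each case than the tabulated 11, 18 and 6. That would be explained if the original program counted from 1, though this reading is a guess. The ordering of the rules is the same either way.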


16. It is surely significant, and disturbing, that INFOMIN did not come 
out the winner in either test. Our conjecture is that the perceived superiority 
of INFOMIN, so far evident in the literature, must derive from the fact that 
it can handle linear constraints in general. This satisfies the desideratum of 
having some definite rule for as many problems as possible. But in the 
absence of a conviction that the large class of problems in question must 
have a solution in the form of a function of the parameters which appear in the 
statement of the problem alone, this desideratum cannot suffice as a 
justification. We hope that a more systematic inquiry into the idea of 
performance testing for proposed rules (like our simulated horse races and 
the wavering test) may lead to more positive virtues to be cited in support of
such rules.

BAS C. VAN FRAASSEN
Princeton University
R. I. G. HUGHES
Yale University
GILBERT HARMAN


A HUMAN FACTOR IN ‘GOOD’ EXPERIMENTS


In a paper entitled “What makes a ‘good’ experiment?” Franklin [1981] 
distinguishes between experiments which are ‘conceptually important’ and 
those which are ‘technically good’. ‘Conceptually important’ experiments 
may permit distinction between two or more theories, may strongly 
corroborate issues central to a particular theory, or may provide new data 
which require changes to an existing theory or elaboration on a new one. 
‘Technically good’ experiments were later divided into those which involve
‘construction of an entirely new apparatus’ and those which depend on ‘an 
improvement in existing apparatus’ (Lai [1984]: “The philosophical 
relevance of technically good experiments”). One implication from these 
discussions is the need to consider not only the conceptual design of an 
experiment but also the technical capabilities of the apparatus. However, 
there is an additional factor which may contribute to successful design and 
accomplishment of ‘good’ experiments in the biological sciences, especially 
human physiology. This is the use of the experimenter himself as the subject 
for the experiment. No implication is intended that ‘self experiments’ are 
‘better’ than those done on other subjects, merely that they can permit the
examination of crucial aspects of some theories which otherwise would have 
remained untested. Obviously ‘self experiments’, in which subjective bias 
may play a role in construction, execution and interpretation of the study, 
require especially careful scrutiny. Some examples from human neuro- 
physiology illustrate the potential contribution from ‘self experiments’ 
to the conduct of ‘good’ science. 

Neural networks within the ponto-medullary region of the hindbrain (so 
called ‘inspiratory’ and ‘expiratory’ centres) automatically adjust breathing
according to prevailing metabolic conditions. An increase within the blood
of metabolically produced carbon dioxide (and/or a reduction in oxygen)
enhances the activity of these centres which increases the rate and depth of 
breathing and thus restores normal levels of carbon dioxide and oxygen. 
Separate networks in the motor parts of the cerebral cortex permit voluntary 
control over breathing movements. For many years debate surrounded 
whether the automatic rhythmic activity of the ponto-medullary respiratory 
centres directly gives rise to the sensation of breathlessness. (This sensation 
of a ‘difficulty’ in breathing frequently accompanies disturbances of the 
cardiorespiratory system.) How could such a fundamental question be 
addressed in a critical experiment? Professor E. J. M. Campbell allowed 
himself to be completely paralysed by an injection of curare. When all the 
skeletal muscles were paralysed (including the respiratory ones) the 
rhythmic neural activity of the respiratory centres would not only have 
continued, but it would have increased as a result of the build-up of carbon
dioxide, yet Campbell noted “I did not know what my inspiratory centre 
was doing; I felt no rhythmic or continuous sensation of any sort in my chest 
or my head” (Campbell [1966]). Suggestions that the activity of the 
respiratory centres (or changes within the blood) directly evoked the 
sensation of breathlessness could no longer be sustained as a result of this 
dramatic experiment. Another observation during the experiment, namely 
the sensation of ‘effort’ which accompanied voluntary attempts to breathe
with paralysed muscles, has provided the backbone for current theories on 
the nature of the sensation of breathlessness (Gandevia [1982]). 

The neurophysiological mechanisms which permit the positions and 
movements of the joints of the body to be perceived have been debated for 
decades without consensus. The term ‘kinaesthesia’ is given to these 
sensations. Specific sense organs (or ‘receptors’) within the skin, the joints 
or the muscles and tendons may, on theoretical grounds, contribute to this 
sensation. (The basis for this assertion is that each of these classes of receptor 
is activated in a specific way by movement of limbs.) Prior to 1972, the 
conventional wisdom was that only receptors within joints mediated 
kinaesthetic sensation. Since then much evidence, some of it indirect, has 
revealed that intramuscular receptors can directly contribute to this 
sensation.¹ A crucial experimental result in support of this reversal of
opinion would be that a direct pull upon the tendon of a muscle evoked a 
distinct kinaesthetic sensation. This manoeuvre can be done so as to 
lengthen the muscle and thereby increase the activity in its receptors 
(without movement of the joints or skin). This conceptually simple
experiment has the power to establish conclusively the ‘kinaesthetic 
sentience’ of muscles. It has been performed by four groups of investigators 


1 A review by McCloskey [1978] contains details of the historical background to this debate and 
describes the studies published in 1972 which first suggested a kinaesthetic role for 
specialized muscle receptors.

and in all but one report the ‘sensibility’ of muscles has been confirmed
although the topic remains controversial.¹ The importance of the scientific
argument surrounding this question has been sufficient for both a
physiologist (McCloskey et al. [1983]) and a hand surgeon (Moberg [1983])
to arrange for manipulation of their own surgically exposed (and even 
severed) tendons under local anaesthesia. Notwithstanding some residual 
disagreement about the interpretation of the ‘self experiments’ recently 
conducted in this field, incisive studies involving the experimenter as a 
subject have been conducted. 

The ability to record the activity of single human nerve fibres in conscious 
subjects required modifications of apparatus and techniques used in animal 
experimentation (Hagbarth and Vallbo [1968]). Development of ‘human 
microneurography’, which has now been used in about a dozen laboratories 
throughout the world, took months of self-experimentation and the 
unsuccessful insertion of many microelectrodes into the peripheral nerves of 
the Swedish pioneers of the technique. The subsequent studies of the 
control of the nerve fibres which innervate muscle receptors, the relation 
between activity in nerve fibres and the sensation evoked, and even the 
neural control of the circulation which employed the technique (and often 
the experimenter) have been of considerable value.² These studies have
addressed critical issues in the areas of movement performance, sensation, 
and autonomic control. Some of them have doubtless led to further 
fundamental experiments which could only be performed in experimental 
animals. This example of a technical development which was the vehicle for
conceptually important experiments in neurophysiology required ‘self 
experimentation’. Other areas of the biological sciences also contain 
valuable examples of such experiments. 

‘Self experimentation’ is not an intrinsically laudable venture unless it 
permits the conduct of a good experiment as defined by Franklin [1981]. In 
contrast to the examples given above, there are notable instances of 
potentially crucial studies in which it may have hindered rather than helped 
the development of knowledge. The reports of the qualitative changes in 
sensation following deliberate surgical transection of a peripheral nerve in 
the experimenter’s own arm (Head [1905], Rivers and Head [1908]) are a 
classic example of the dangers of introspection. The division of all cutaneous 
sensations according to epicritic and protopathic qualities resulted from this 
single experiment and it took several decades to overturn (see Walshe 
[1942]). 

1 A fuller account of the results of these experiments can be found in the original reports by
Moberg [1983] and McCloskey et al. [1983]. For subsequent debate and a study which results
directly from this controversy see Gandevia [1985].

2 For an overall review of the data obtained with this technique see Vallbo et al. [1979].


Scientific research involving healthy human subjects should ideally
address questions which cannot be investigated by animal experimentation
(because, for example, of the requirement for a verbal description of
an event as in the Campbell experiment) or questions which require
complementary investigation in both animals and human subjects. Such
questions will always remain, as will the role for the ‘self experimenter’ in
the conduct of technically and conceptually ‘good’ science.¹


S. C. GANDEVIA 
University of New South Wales²


1 Although a consideration of the motivation behind the self experiments described here is
beyond the scope of this essay, many of the experimenters value a little knowledge of
something ‘high’ above much knowledge of something ‘low’. A discussion of this hierarchy of
knowledge and its elaboration by Aristotle and St. Thomas Aquinas is given by Schumacher
[1977]. I am unaware of the previous application of the concept to biological evolution.

2 This work was carried out while a visitor in the School of Medicine, University of Auckland.


SEPARABLE HIDDEN VARIABLES THEORY TO EXPLAIN
EINSTEIN-PODOLSKY-ROSEN PARADOX 


A realist separable hidden variables theory in conformity with Einstein’s principle of 
causality is developed in this paper to explain the Einstein-Podolsky-Rosen paradox, 
and the experimental results (including those in Aspect’s four polarizers experiment) 
obtained so far with a view to test the non-separability of quantum mechanics. 


1 Introduction
2 The Variables v_1 and v_2
3 The two polarizers experiment
4 The rapport between the polarizers I_1 (a) and I_2 (b)
5 The variables v_1 and v_2 are separable hidden variables
6 Aspect’s four polarizers experiment
7 For Aspect’s four polarizers experiment quantum theory and the hidden
  variables theory give the same predictions
8 An experiment to rule out the realist separable hidden variables theory
9 Other Einstein-Podolsky-Rosen situations
10 Plausibility


1 INTRODUCTION


Einstein-Podolsky-Rosen [1935] have pointed out a feature of quantum 
mechanics, which was later termed the ‘non-separability’ of quantum 
mechanics; Bohm [1957] suggested an experiment involving the 
measurements of the polarizations of correlated photons as an experimental 
verification of such ‘non-separability’. Bell [1964] pointed out the crucial 
importance, in any such experiment, of changing the settings of the
polarizers which measure the polarizations of the correlated photons ‘during 
the flight of the photons’. If in an experiment in which the settings of the 
polarizers are changed during the time of flight of the correlated photons (as 
for instance by rotation of the polarizers), the correlations predicted by 
quantum theory are still verified, that would clearly be an important 
experimental proof, not merely against Bell’s locality, but also against 
Einstein’s causality, and would clearly rule out separable hidden variables 
theories as an explanation of the EPR paradox; and we may have to admit 
that there are at the level of supplementary parameters faster than light 
influences. Aspect [1976] proposed an experiment involving four polarizers 
which he obviously considered to be equivalent to such an experiment. 
Aspect [1976] has stated that the proposed experiment is interesting ‘in that 
it embodies a device for changing orientations of the analyzers in a time 
comparable to the time of flight of the photons’, and that ‘a result consonant 
with the quantum theory predictions would imply the rejection of separable 
hidden variables theories.’ This proposed experiment has by now been
performed (Aspect [1982]) and its results are in conformity with quantum
theory predictions. 

It is therefore of considerable theoretical interest that a realist separable
hidden variables theory can be developed which, while conforming to
Einstein’s principle of separability, still leads to the same predictions as those
of quantum theory for the experiment performed by Aspect [1982]. This clearly
shows that Aspect’s four polarizers experiment is not really equivalent to
an experiment in which the setting of each particular measuring instrument
is actually changed, through rotation of the polarizers, during the time of
flight of the approaching photons.


2 THE VARIABLES $v_1$ AND $v_2$


Every photon in a beam emerging from a polarizer in orientation p is
characterized by the variable $v_1 = p$ in the sense that each such photon will
pass a polarizer in orientation p and will not pass a polarizer in orientation
$p + \pi/2$. To explain the behaviour of a photon in such a beam when faced by a
polarizer in orientation a, we postulate the existence of a hidden variable $v_{2a}$,
whose values (0 or 1) are induced by the polarizer in the photon, such that if
$v_{2a} = 0$, it will pass the polarizer on meeting it, and if $v_{2a} = 1$, it will not pass
it. Obviously $v_{2a}$ is a probabilistic variable in the sense that while its value (0
or 1) cannot be predicted for any particular photon in the beam, we know
that if a sufficiently large number of photons from the beam are selected in a
random manner, the proportion of those with $v_{2a} = 0$ in these photons is
$\cos^2(p - a)$ in conformity with Malus’ law; and the remaining photons have
$v_{2a} = 1$.
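
The probabilistic rule just stated can be illustrated with a short simulation. This is only a sketch: the function names `induce_v2` and `pass_fraction` are hypothetical, and the pseudo-random draw merely stands in for whatever physical process is supposed to fix the hidden value in each photon.

```python
import math
import random

def induce_v2(p, a, rng):
    """Hidden value induced by a polarizer in orientation a in a photon
    of polarization p: 0 (will pass) with probability cos^2(p - a),
    otherwise 1 (will not pass)."""
    return 0 if rng.random() < math.cos(p - a) ** 2 else 1

def pass_fraction(p, a, n=100_000, seed=0):
    """Fraction of n photons of polarization p whose induced value is 0."""
    rng = random.Random(seed)
    passed = sum(1 for _ in range(n) if induce_v2(p, a, rng) == 0)
    return passed / n

if __name__ == "__main__":
    # For p - a = 60 degrees Malus' law gives cos^2 = 0.25.
    print(pass_fraction(0.0, math.pi / 3))
```

For a large sample the passing fraction settles near the Malus value, while the value for any single photon remains unpredictable, which is all that the text requires of the variable.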

We assume that every polarizer at rest is surrounded by a field of influence 
(a function of its orientation) which extends up to a critical finite distance 
from it; and that as soon as a photon approaches it within this distance, a
value of $v_2$ (0 or 1) is induced in it, just as, as soon as a meteor enters the
gravitational field of the earth, its path gets deflected. Any change in the
orientation of the polarizer is assumed to cause a change in this field of 
influence, such a change travelling outwards from the polarizer with a 
velocity not exceeding that of light. Bell [1964] had made a conjecture that 
the quantum mechanical predictions might be of limited validity, applying 
only to experiments in which the settings of the instruments are made 
sufficiently in advance to allow them to reach some mutual rapport. The 
fields of influence of the polarizers postulated above can, as shown below, be 
the mechanism through which such rapport is established. 


3 THE TWO POLARIZERS EXPERIMENT 


An atom which, when raised to a higher energy level, first emits a photon and
falls to an intermediate energy level, and then emits another photon and falls
to its original level is a convenient source of correlated photons. Consider a
source of light in which such cascading atoms are producing pairs of
correlated photons. If the two correlated photons produced by a cascading
atom fly in exactly opposite directions, and their polarizations are measured
by polarizers placed in their paths, such an experiment can test the quantum
theory predictions regarding such measurements. Consider an experiment
in which photon 1 created first by the cascading atom meets a polarizer $I_1(a)$
in orientation a, and the corresponding photon 2 produced next by the
cascading atom travelling in the opposite direction meets the polarizer $II_1(b)$
in orientation b. Suppose both the polarizers are within the finite critical
distance from the atoms emitting the photons. Then as soon as photon 1 is
created, $I_1(a)$ will produce in it the value 0 or 1 of $v_{2a}$, and as soon as photon 2
is created, $II_1(b)$ will create in it the value 0 or 1 of $v_{2b}$.

Considerations of conservation of total angular momentum suggest that
the polarizations $p$ and $p'$ of photon 1 and the corresponding photon 2
respectively must be related. We assume that photon 1 is created with the
same probability with any possible value of p (i.e. $v_1 = p$) and in this
photon 1, immediately after its creation, the polarizer $I_1(a)$, being within the
finite critical distance from the photon, creates the value 0 or 1 of $v_{2a}$ which
decides if on meeting $I_1(a)$ it will pass it or will not pass it. We assume that
the polarization $p'$ of photon 2 is related not to the polarization p of photon 1,
but to the value 0 or 1 of $v_{2a}$ induced in it by the polarizer $I_1(a)$. We assume
that in the 0-1-0 case, $p' = a$ if $v_{2a} = 0$, and $p' = a + \pi/2$ if $v_{2a} = 1$. In the
1-1-0 case, $p' = a + \pi/2$ if $v_{2a} = 0$, and $p' = a$ if $v_{2a} = 1$. Thus the value of
the variable $v_1$ for photon 2 is related to the value of the variable $v_{2a}$ of the
corresponding photon 1. It is in a photon 2 with such a value of variable $v_1$
that the polarizer $II_1(b)$ will produce the value 0 or 1 of $v_{2b}$.

All photons 1 corresponding to such photons 2 as have $v_1 = a$ would (in
the 0-1-0 case) pass $I_1(a)$; and among the photons 2 which have $v_1 = a$ the
proportion of those which have $v_{2b} = 0$ will be, in accordance with Malus’
law, $\cos^2(a - b)$. It is known that photons 1 travelling towards $I_1(a)$ constitute
unpolarized light, that is, among the photons 1, the proportion of photons
with polarization lying between p and p + dp is just $dp/2\pi$. For the
coincidence count rate in the 0-1-0 case we therefore have


$$R(a,b) = 4R_0 \int_0^{\pi/2} \cos^2(p-a)\,\cos^2(a-b)\,\frac{dp}{2\pi} \qquad (1)$$

and in the 1-1-0 case we have

$$R(a,b) = 4R_0 \int_0^{\pi/2} \cos^2(p-a)\,\sin^2(a-b)\,\frac{dp}{2\pi} \qquad (2)$$


where $R_0$ is the coincidence count rate in the absence of the polarizers $I_1(a)$
and $II_1(b)$. These rates agree with the predictions of quantum theory.
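
The agreement claimed here can be checked numerically. The sketch below rests on one assumption that should be stated plainly: the average over the polarization p of photon 1 is taken over the full circle from 0 to 2π with the uniform density dp/2π given in the text. That is one reading of the normalization, not a transcription of the printed limits. The function names are hypothetical.

```python
import math

def mean_over_p(f, steps=3600):
    """Average f(p) over the full circle with uniform density dp/(2*pi),
    using the midpoint rule."""
    h = 2.0 * math.pi / steps
    return sum(f((k + 0.5) * h) for k in range(steps)) * h / (2.0 * math.pi)

def rate_ratio(a, b, cascade="0-1-0"):
    """R(a, b) / R0 in the two-polarizer model described above."""
    # Photon 1 passes its polarizer with probability cos^2(p - a), averaged over p.
    pass_1 = mean_over_p(lambda p: math.cos(p - a) ** 2)
    # Photon 2 is created with polarization a (0-1-0) or a + pi/2 (1-1-0).
    if cascade == "0-1-0":
        pass_2 = math.cos(a - b) ** 2
    elif cascade == "1-1-0":
        pass_2 = math.sin(a - b) ** 2
    else:
        raise ValueError("cascade must be '0-1-0' or '1-1-0'")
    return pass_1 * pass_2

if __name__ == "__main__":
    a, b = 0.3, 1.1
    print(rate_ratio(a, b), 0.5 * math.cos(a - b) ** 2)
```

Under that assumption the average of $\cos^2(p - a)$ is 1/2 for every a, so the ratio reduces to $\tfrac{1}{2}\cos^2(a - b)$ or $\tfrac{1}{2}\sin^2(a - b)$, the familiar quantum theory expressions for ideal polarizers.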


4 THE RAPPORT BETWEEN THE POLARIZERS $I_1(a)$ AND $II_1(b)$


In conformity with Einstein’s causality, we assume that when any polarizer 
is placed at any point in space and set in a particular orientation, the field of 
influence corresponding to that orientation spreads outwards in space with a 
speed less than or equal to that of light. The spread of the field of influence of
each of the polarizers $I_1(a)$ and $II_1(b)$ up to the other polarizer and the source
of light occurs before the start of the experiment; these spreading fields are the
signals which pass between the polarizers, and by means of which they
establish the mutual rapport conjectured by Bell [1964].


5 THE VARIABLES $v_1$ AND $v_2$ ARE SEPARABLE
HIDDEN VARIABLES


Aspect [1976] has expressed the principle of separability for the experiments 
under consideration thus: the setting of a measuring device at a certain time 
(event A) does not influence the result obtained with another measuring 
device (event B) if the event B is not in the forward light cone of A (nor does 
it influence the way in which particles are emitted by a source if the emission 
is not in the forward light cone of event A). In our theory the setting of
polarizer $I_1(a)$ in orientation a does influence the emission of photon 1, but
since this setting of the polarizer $I_1(a)$ is made sufficiently in advance of the
start of the experiment, the emission of photon 1 lies in the forward light
cone of event A. The setting of $II_1(b)$ in orientation b and the emission of the
photon 1 do influence the emission of the corresponding photon 2, but the
emission of this photon 2 lies in the forward light cones of the setting of $II_1(b)$
in orientation b and of the emission of photon 1. So the variables $v_1$ and $v_2$
employed in our theory are clearly separable hidden variables.


6 ASPECT’S FOUR POLARIZERS EXPERIMENT 


In Aspect’s [1976], [1982] four polarizers experiment, photons 1 from
cascading atoms are directed by a commutator $C_A$ towards polarizer $I_1(a_1)$
in orientation $a_1$ or towards polarizer $I_2(a_2)$ in orientation $a_2$. The
corresponding photons 2 flying in the opposite direction are directed by the
commutator $C_B$ towards the polarizer $II_1(b_1)$ in orientation $b_1$ or towards the
polarizer $II_2(b_2)$ in orientation $b_2$. The two commutators work
independently and in a stochastic manner, and do not change the
polarizations of the photons. The four joint detection rates $(I_1, II_1)$,
$(I_1, II_2)$, $(I_2, II_1)$, $(I_2, II_2)$ are monitored, and the orientations $a_1$, $a_2$, $b_1$, $b_2$
are not changed in the whole experiment.

In such an experiment, since there are two polarizers inducing values of
$v_{2a_1}$ and $v_{2a_2}$ in the same photon 1, it becomes necessary to formulate some
hypothesis as to the values of the variable $v_2$ produced in photon 1. If we
adopt the hypothesis that in such a situation the value 0 or 1 of $v_2$ is induced
by each of the two polarizers independently of each other, but in each case in
accordance with Malus’s law, clearly the quantum theory predictions
regarding the coincidence counting rates cannot be realized. But we can
assume that the two polarizers $I_1(a_1)$ and $I_2(a_2)$ have established a mutual
rapport with the result that as soon as any photon 1 is created with
polarization p, one of the two polarizers (say $I_1(a_1)$) first induces in it the
value 0 or 1 of $v_2$ (say of $v_{2a_1}$), in accordance with Malus’s law, and next the
other polarizer (say $I_2(a_2)$) induces in it the opposite value 1 or 0 of $v_2$ (say of
$v_{2a_2}$). Thus in the beam of photons 1 approaching the commutator $C_A$, the
probability that a photon 1 has $v_{2a_1} = 0$ is 1/2; and if a photon 1 has $v_{2a_1} = 0$, it
necessarily has (with probability 1) $v_{2a_2} = 1$; if a photon 1 has $v_{2a_2} = 0$, it
necessarily has $v_{2a_1} = 1$. We also assume that the corresponding photon 2 is
created with value of $v_1$ related to $a_1$ if the photon 1 has $v_{2a_1} = 0$ and related
to $a_2$ if photon 1 has $v_{2a_2} = 0$. Thus if photon 1 has $v_{2a_1} = 0$, the
corresponding photon 2 has $v_1 = a_1$ in the 0-1-0 case, and has $v_1 = a_1 + \pi/2$
in the 1-1-0 case. If photon 1 has $v_{2a_2} = 0$, the corresponding photon 2 has
$v_1 = a_2$ in the 0-1-0 case and has $v_1 = a_2 + \pi/2$ in the 1-1-0 case. With these
assumptions the hidden variables theory gives, for Aspect’s four polarizers
experiment, the same predictions as given by quantum theory. This is
shown in the next section.


7 FOR ASPECT’S FOUR POLARIZERS EXPERIMENT 
QUANTUM THEORY AND THE HIDDEN VARIABLES THEORY 
GIVE THE SAME PREDICTIONS 


In Aspect’s four polarizers experiment, the commutators $C_A$ and $C_B$ do not
change the polarizations of the photons; also they work independently and in
a stochastic way. So the proportion of photons 1 with $v_{2a_1} = 0$ is 1/2 in the
beam of photons 1 approaching $C_A$, and this proportion is the same in each
of the two partial beams approaching $I_1(a_1)$ and $I_2(a_2)$; and the same is true
regarding the proportion of photons 1 with $v_{2a_2} = 0$ in the beam
approaching $C_A$, and in the two partial beams approaching $I_1(a_1)$ and $I_2(a_2)$
respectively. Consider the beam of photons 2 corresponding to those
photons 1 which had the value $v_{2a_1} = 0$ and were directed by the
commutator $C_A$ to polarizer $I_1(a_1)$ and so had passed it. In the 0-1-0 case the
polarizer $II_2(b_2)$ would induce the value 0 of the variable $v_{2b_2}$ in a proportion
$\cos^2(a_1 - b_2)$ of such photons 2 (since each of the photons 2 in this beam has
$v_1 = a_1$); and of these a proportion (say q) would be directed by the
commutator $C_B$ towards $II_2(b_2)$. So if $R_0(a_1, b_2)$ is the coincidence counting
rate in the absence of the polarizers, then with the polarizers in position, the
coincidence counting rate $R(a_1, b_2)$ would be $R_0(1/2)\cos^2(a_1 - b_2)$ in the
0-1-0 case and would be $R_0(1/2)\sin^2(a_1 - b_2)$ in the 1-1-0 case. Similarly
for any coincidence counting rate $R(a_i, b_j)$, where i = 1 or 2 and j = 1 or 2, the
coincidence counting rate would be the same as that predicted by quantum
theory.
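
The bookkeeping of this section can be followed in a small Monte Carlo sketch for the 0-1-0 case. Several things in it are assumptions made only for illustration: each commutator is taken to pick its output at random with probability 1/2, the pairs routed to a given pair of polarizers play the role of the rate without polarizers, and the function name `four_polarizer_ratios` is hypothetical.

```python
import math
import random

def four_polarizer_ratios(a1, a2, b1, b2, n=200_000, seed=1):
    """Estimate R(a_i, b_j) / R0(a_i, b_j) for the 0-1-0 cascade."""
    rng = random.Random(seed)
    b = {1: b1, 2: b2}
    routed = {(i, j): 0 for i in (1, 2) for j in (1, 2)}
    coincidences = {(i, j): 0 for i in (1, 2) for j in (1, 2)}
    for _ in range(n):
        p = rng.uniform(0.0, 2.0 * math.pi)  # polarization of photon 1
        # The first polarizer induces its value by Malus' law;
        # the second polarizer then induces the opposite value.
        v2 = {1: 0 if rng.random() < math.cos(p - a1) ** 2 else 1}
        v2[2] = 1 - v2[1]
        # Photon 2 is created with v1 tied to the polarizer photon 1 would pass.
        v1_photon2 = a1 if v2[1] == 0 else a2
        # Assumed: each commutator chooses its output with probability 1/2.
        i, j = rng.choice((1, 2)), rng.choice((1, 2))
        routed[(i, j)] += 1
        passes_1 = v2[i] == 0
        passes_2 = rng.random() < math.cos(v1_photon2 - b[j]) ** 2
        if passes_1 and passes_2:
            coincidences[(i, j)] += 1
    return {k: coincidences[k] / routed[k] for k in routed}

if __name__ == "__main__":
    for pair, ratio in four_polarizer_ratios(0.0, 0.6, 0.3, 1.0).items():
        print(pair, round(ratio, 3))
```

Within sampling error each of the four ratios comes out close to one half of the squared cosine of the difference between the two orientations involved, which is the behaviour the text attributes to both theories when the orientations are never changed.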
The experimental results (Aspect [1982]) in Aspect’s four polarizers
experiment are in conformity with the predictions of quantum theory, but 
these results clearly do not rule out the separable hidden variables theory 
developed above. 


8 AN EXPERIMENT TO RULE OUT THE REALIST SEPARABLE 
HIDDEN VARIABLES THEORY 


It seems a natural assumption that the postulated physical influence of a 
polarizer on a photon should decrease with increasing distance between the 
polarizer and the photon, and beyond a certain finite critical distance should 
become inappreciable. So if the distance between the polarizer and the 
source of the photons is made sufficiently large, the hidden variables theory 
should predict deviations from the predictions of quantum theory. With 
sufficiently large distances between the cascading atoms and the polarizers 
$I_1(a)$ and $II_1(b)$, and no other polarizers present within the critical finite
distance from the cascading atoms, the only correlations between photon 1
and the corresponding photon 2 would be that in the 0-1-0 case $p'$
(the polarization of photon 2) = p (the polarization of the corresponding
photon 1), and in the 1-1-0 case it would be $p' = p + \pi/2$. So the coincidence
counting rate would be in the 0-1-0 case


$$R(a,b) = 4R_0 \int_0^{\pi/2} \cos^2(p-a)\,\cos^2(p-b)\,\frac{dp}{2\pi} \qquad (3)$$

and in the 1-1-0 case the coincidence counting rate would be

$$R(a,b) = 4R_0 \int_0^{\pi/2} \cos^2(p-a)\,\sin^2(p-b)\,\frac{dp}{2\pi}. \qquad (4)$$
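
The size of the predicted deviation can be made concrete with a numerical average of the two integrands above. As before, this is a sketch under a stated assumption: the average over p is taken over the full circle with uniform density dp/2π. The names `far_rate_ratio` and `near_rate_ratio` are hypothetical labels for the distant-polarizer and nearby-polarizer cases.

```python
import math

def mean_over_p(f, steps=3600):
    """Average f(p) over the full circle with uniform density dp/(2*pi)."""
    h = 2.0 * math.pi / steps
    return sum(f((k + 0.5) * h) for k in range(steps)) * h / (2.0 * math.pi)

def far_rate_ratio(a, b, cascade="0-1-0"):
    """R(a, b) / R0 when the polarizers are beyond the critical distance."""
    if cascade == "0-1-0":
        return mean_over_p(lambda p: math.cos(p - a) ** 2 * math.cos(p - b) ** 2)
    if cascade == "1-1-0":
        return mean_over_p(lambda p: math.cos(p - a) ** 2 * math.sin(p - b) ** 2)
    raise ValueError("cascade must be '0-1-0' or '1-1-0'")

def near_rate_ratio(a, b):
    """R(a, b) / R0 within the critical distance, 0-1-0 case."""
    return 0.5 * math.cos(a - b) ** 2

if __name__ == "__main__":
    for deg in (0, 30, 45, 60, 90):
        d = math.radians(deg)
        print(deg, round(near_rate_ratio(0.0, d), 4), round(far_rate_ratio(0.0, d), 4))
```

Under this assumption the averages have the closed forms $\tfrac{1}{4} + \tfrac{1}{8}\cos 2(a-b)$ in the 0-1-0 case and $\tfrac{1}{4} - \tfrac{1}{8}\cos 2(a-b)$ in the 1-1-0 case, to be compared with $\tfrac{1}{4} \pm \tfrac{1}{4}\cos 2(a-b)$ for nearby polarizers. The modulation with angle is halved; for a = b in the 0-1-0 case the ratio is 3/8 rather than 1/2.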


Unfortunately, we have at present no idea as to what would be a 
sufficiently long distance for this to happen. Aspect [1981] has reported that
no significant change in the results was observed with source-polarizer
separations of up to 6.5 m, i.e. up to four coherence lengths of the wave packet
associated with the lifetime of the intermediate state of the cascade. But
clearly neither the coherence length, nor indeed the absorption length 
suggested by Franson [1982] is much of a guide to calculate the distance 
beyond which the physical influence of the polarizer on the photon should 
become inappreciable. 

However, if the polarizers $I_1(a_1)$ and $II_1(b_1)$ are made to rotate in such a
manner that while $a_1$ and $b_1$ change with time, the difference $(a_1 - b_1)$
remains constant, the hidden variables theory should predict deviations
from the quantum theoretical predictions. It is natural to assume that the
physical influence of a polarizer on a photon should change, if the
orientation of the polarizer is changed, and that any such change in the
influence should travel outwards into space from the polarizer with a speed
less than or equal to that of light. Therefore, if the rate of rotation is fast
enough, the angles $a_1$, $b_1$ which induced the hidden variables $v_{2a_1}$ and $v_{2b_1}$ in
the photon 1 and photon 2 respectively would be appreciably different from
the orientations $a_1'$ and $b_1'$ of the polarizers $I_1(a_1)$ and $II_1(b_1)$ when the
photons meet the polarizers. In this situation, appreciable deviations from
quantum theory predictions should be expected in the hidden variables
theory. In particular, when the difference $a_1 - b_1$ is always kept 0 during the
rotation of the polarizers $I_1$ and $II_1$, in the 0-1-0 case, the correlations
between photon 1 passing polarizer $I_1$ and the corresponding photon 2
passing $II_1$ would still not be 100 per cent in the hidden variables theory. If
in such an experiment, the quantum mechanical predictions are still
verified, that would be a good reason to rule out such a separable hidden
variables theory.


9 OTHER EINSTEIN-PODOLSKY-ROSEN SITUATIONS 


If we make similar assumptions about other measuring instruments, the
quantum theory predictions in other EPR situations are similarly predicted
and explained by a realist separable hidden variables theory, so long as the
measuring instruments are kept within the critical finite distance from the
place where the quantum systems, later measured, had in the past
interacted, and the orientations of the measuring instruments are left
unchanged throughout the experiment. For instance, consider a system of
spin 1/2 particles in the singlet state with particle 1 moving in the +z direction
and particle 2 moving in the -z direction towards Stern-Gerlach
instruments A and B which measure their spins in the a and b directions
respectively. Considerations of conservation of angular momentum suggest
that the spins of the particles 1 and 2 are related, being opposite to each other,
in the sense that the Stern-Gerlach instruments with the same orientations
would find spins of a pair of particles 1 and 2 to be opposite in directions.
Considerations of symmetry suggest that the spins of particles 1 (and so also
of particles 2) are distributed in all directions uniformly in the sense that a
beam of particles 1 measured by a Stern-Gerlach instrument in any
orientation would be found to consist of half the number of particles with
spin +1/2, half with -1/2. We assume that instrument A induces in particles 1
the values 0 or 1 of $v_{2a}$ which determine whether the inhomogeneous field in
it will treat the particle as one of spin +1/2 or -1/2 respectively, and opposite
values of $v_{2a}$ in the corresponding particle 2, and instrument B induces in
this particle 2 values of $v_{2b}$ opposite to its $v_{2a}$ values (0 or 1) with probability
$\sin^2 \tfrac{1}{2}(a - b)$. The spin measurements by A and B on particles 1 and 2
respectively would then be correlated in the manner predicted by quantum
theory. A rotation of any one of the two measuring instruments during the
time between a particle leaving the place of origin and meeting the instrument
would alter these correlations.
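
The spin 1/2 rule just described can be sketched in the same way. The code below is illustrative only: outcomes are coded as +1 and -1 to stand for spin +1/2 and -1/2, the function name `singlet_correlation` is hypothetical, and a and b are treated as angles in a common plane.

```python
import math
import random

def singlet_correlation(a, b, n=200_000, seed=2):
    """Average product of the two coded outcomes under the induction rule."""
    rng = random.Random(seed)
    flip_prob = math.sin(0.5 * (a - b)) ** 2
    total = 0
    for _ in range(n):
        v2a_1 = rng.choice((0, 1))   # value induced by A in particle 1
        v2a_2 = 1 - v2a_1            # opposite value in particle 2
        # B induces the value opposite to v2a_2 with probability sin^2((a - b)/2).
        v2b_2 = 1 - v2a_2 if rng.random() < flip_prob else v2a_2
        s1 = 1 if v2a_1 == 0 else -1
        s2 = 1 if v2b_2 == 0 else -1
        total += s1 * s2
    return total / n

if __name__ == "__main__":
    print(singlet_correlation(0.0, 1.0), -math.cos(1.0))
```

With this rule the two outcomes agree with probability $\sin^2 \tfrac{1}{2}(a - b)$, so the average product is $-\cos(a - b)$, the singlet-state correlation of quantum theory. For equal orientations the outcomes are always opposite.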

However, there is an important difference between EPR situations with
photons from cascading atoms flying in opposite directions and other EPR
situations in which the quantum systems being measured are approaching
the measuring instruments with velocity less than that of light. In these
other cases it is theoretically possible that the quantum theory predictions of
measurement correlations are verified because of signals travelling with
velocity equal to or less than that of light passing between the two measuring
instruments during the time between the two correlated measurements; and
whenever the space time locations of the measurements are such that a signal
travelling with a speed not exceeding that of light cannot connect them,
experiment may show the quantum theory predictions to fail. In the case of 
photon polarization correlations with the correlated photons flying in 
opposite directions from the cascading atoms, with the polarization of 
photon 1 measured after the corresponding photon 2 is emitted, such 
alternative theoretical explanations (through signals passing between the 
polarizers between the times of the two correlated measurements) become 
impossible. 


10 PLAUSIBILITY


There are some types of physical influence that a macroscopic body like a 
measuring instrument exerts on a quantum system only when the quantum 
system makes actual contact with the macroscopic body or actually enters it. 
Thus, photons in a beam of light with polarization q, incident on a calcite
crystal change their direction of motion and start travelling along the 
direction of the ordinary or the extraordinary ray only on entering the calcite 
crystal. Similarly a spin 1/2 particle travelling along the y axis develops a
component of velocity along the +z or -z direction only after it enters a
Stern-Gerlach instrument in suitable orientation placed along the y axis.
But there is no reason to imagine that the physical process which determines
whether a particular photon with polarization q will, on entering the calcite
crystal, travel along the ordinary ray or the extraordinary ray (or the physical
process which determines whether a particular spin 1/2 particle, on entering
the Stern-Gerlach instrument, will behave like a spin +1/2 or a spin -1/2
particle), also starts and is completed only when the photon enters the calcite
crystal (or the spin 1/2 particle enters the Stern-Gerlach instrument). In fact
phenomena of everyday occurrence in the physics laboratories practically 
force on us the conclusion that a macroscopic body often exerts a physical 
influence on a photon which is all the time at a finite distance from it. 
Consider Young’s two slit interference experiment. If an opaque plate 
(a macroscopic body) is placed immediately behind one of the slits, the 
whole interference pattern changes, clearly showing that the opaque plate 
is affecting the behaviour of photons passing through the other slit and 
therefore all the time at a finite distance from it. Or imagine light of 
polarization q passing two calcite crystals of equal dimensions placed in its 
path (with some distance between them) and in such orientations that the
beam of light emerges from the second calcite crystal as a reconstituted beam
of polarization q. If we now place an opaque plate between the two calcite
crystals in such a way as to cut out the ordinary ray, we find that the photons
in the extraordinary ray now leave the second calcite crystal with
polarization of the extraordinary ray, not polarization q. The opaque plate
has changed the polarization behaviour of the photons which all the time 
were at some finite distance from it. Thus the hypothesis that a macroscopic 
body exerts a physical influence on a photon which is within a finite distance 
from it is inescapable, once we try to visualize a beam of light as a beam of 
photon particles. 

The hypothesis that a macroscopic body like a measuring instrument 
exerts a physical influence on a quantum system within a critical finite 
distance from it need not therefore be considered too strange or implausible. 
At any rate this hypothesis is a physical, falsifiable hypothesis and can be put 
to experimental test. 


S. V. BHAVE


Review Article


DISCOVERING AND UNDERSTANDING THE MEANING 
OF PRIMATE SIGNALS* 


This volume, edited by a philosopher and an anthropologist, is a collection of essays 
on the philosophical implications of laboratory and field research. While neither the 
best nor the worst of the genre, it is a collection that offers a representative sample of 
traditional themes. As practicing scientists who view the implications of behavioural 
research from a somewhat different perspective we offer this critical review. 


1 Intention and Introspection
2 Observation and Inference
3 Projective Introspection
4 Experimental Ethology
5 Experimental Primatology
6 Cognitive Ethology
7 Introspection and Folklore
8 Communication and Intelligence
9 Concluding Comment


Scientific research into the intelligent behaviour of human and nonhuman 
animals has been accelerating for decades, and some of the most exciting 
discoveries can be found in laboratory and field studies that reveal the 
variety, the precision, the complexity, the subtlety, the versatility of social 
communication in nonhuman primates. Those hoping to learn about the 
substance of these new developments in a book entitled, The Meaning of 
Primate Signals, will be disappointed. In this volume (henceforth MOPS) 
edited by Rom Harré, a philosopher, and Vernon Reynolds, an anthro- 
pologist, ‘meaning’ rarely, if ever, refers to the information that might be 
conveyed by primate signals. Their concern is not the meaning of what the 
primates say, but whether they mean to say it—whether nonhuman beings 
have intention. 


In the enterprise described in this book the authors are trying to provide a clear 
enough answer to the question of how similar (and how dissimilar) animal intentions 
are to those of humans .. . (Editors, p. 7) 


The attraction of science is the promise of discovery. Great industries and
great nations spend vast sums in the hopes that further discoveries will make
them richer and more powerful. Publishers and broadcasters retail news to a
faithful audience eager to share in the adventure and the profits of scientific
discovery. The attraction of philosophy is the promise of understanding.
But, understanding is a subjective experience. What makes us say that we
understand this or that idea? What makes us say that students understand:
is it that they can repeat back what they have heard or read, that they can pass
examinations, that they say they understand? By contrast, those who
promise adventure, profit, power, must have something external and
palpable to offer: mushroom-shaped clouds, moon rocks, chess-playing
machines. Where philosophy can be lofty and noble, science must be crass
and materialistic.


* Review of Rom Harré and Vernon Reynolds (Eds.) [1984]: The Meaning of Primate Signals.
Cambridge University Press. £25.00.


1 INTENTION AND INTROSPECTION


The MOPS approach is militantly subjective. Where a scientific inquiry 
would begin with a description of tangible evidence for the existence of 
human intention, the editors and contributors to MOPS begin by assuming 
that human intention is so well-accepted that description is unnecessary. 


We feel relatively safe in assuming that humans intend to communicate certain 
information when they speak, and we generally assume that such intention is 
somehow unique to human language. (Seyfarth, p. 44) 


Introspection, our own experiences and detailed accounts by others of remembering, 
forgetting, making decisions, and so on, give us reasonably direct access to the kind 
of mental images that we may associate with intention, and what this evidence lacks 
in ‘objectivity’ seems more than made up for by vivid clarity of detail. (Quiatt, p. 10) 


By these criteria the Old Testament could compete with the Origin of Species
and Aesop’s Fables with Tinbergen’s Study of Instinct. 

Within the covers of MOPS, the only remaining attempt to describe a 
starting point for the inquiry can be found in the editorial introduction, 


Each of us has personal experience (and therefore knowledge) of the complex, 
cognitive thought processes that go on in his or her head. We know we think, and we 
know our thoughts to be long, convoluted and interminable. We also know that these 
thoughts underlie our actions. They do not perhaps entirely determine our actions 
but they do guide them. 

. . . We have a unique entrée into the human mind: our personal subjective 
experience. From this starting point which we must accept as a valid source of data 
(for if our own experience is not valid then what is?), we can attempt to generalize 
outward into the minds of others. If we could not do this, we could not explain the 
actions of other people. What they do makes sense only in terms of how it compares 
with what we would do in a comparable situation. That is the basis of common-sense, 
or ‘folk’ psychology. (p. 1) 


This is a clear departure from the ordinary rules of scientific evidence. 
First, there is the claim that reports of subjective experience are as valid as 
any other source of data. This is based on the canard that, since all scientific 
observations must ultimately be experienced by human beings—even 
instrument dials and computer printouts must be read by human eyes—
then all human experience is scientific observation. The other side of this
coin is the claim that unbiased observation is a myth because all human 
observation is necessarily distorted by the biased expectations of prejudiced 
observers. Between the two there is much comfort for those who would 
avoid the fatigue of rigorous investigation. There is also no way to account 
for scientific progress except as a series of brilliant insights achieved by 
inspired genius (although there is a school in which scientific progress is 
itself a myth of biased observation). Practicing scientists learn, however, 
that what they are practicing is one of those arts of the possible in which 
there are many degrees of rigour. Fortunately, there is a difference between 
disciplined, publicly verifiable, observation and ordinary subjective ex- 
perience, and several hundred years of modern science have demonstrated 
that the difference is well worth the effort. 


2 OBSERVATION AND INFERENCE 


The second departure from scientific rules of evidence is the MOPS 
confusion between observation and inference. The problem of separating 
observation from inference enters a scientific investigation at several levels. 
A reading of 25 on one instrument and 14:02 on another are possible 
observations, but the statement that a certain chamber reached a tempera- 
ture of 25 degrees Centigrade at 14:02 hours is an inference. Practicing 
scientists devote a great deal of space and effort in their research reports to 
the descriptive apparatus that separates observations from low-level in- 
ferences of this kind. The procedure is known as operational definition and it 
often makes original sources into tedious reading. Without this technical 
language, however, scientific reports become unintelligible, and research 
becomes unreplicable. 

Blurton-Jones [1967], for example, became one of the founders of human 
ethology when he introduced rigorous operational definitions into his 
reports of children playing in groups. Before his influential work, the 
available records defeated attempts at comparison because they presented 
subjective impressions rather than observations. The meticulous detail with 
which Blurton-Jones described each category of behaviour makes difficult 
reading for the dilettante, but it also made new discoveries possible and, 
more important, the verification and extension of these new discoveries by 
other investigators. When observations of rough-and-tumble play, for 
example, could be separated from the several categories usually lumped 
together as aggressive, they correlated not with aggression but with other 
forms of play. When six different smiles were carefully identified and 
tabulated separately, they also appeared in significantly different situations. 
The oblong smile, for example, appeared in fights between children and in 
non-social situations involving danger or risk of impact. In MOPS, Harré 
singles out Blurton-Jones’ research as an example of ‘misplaced positivism’
(p. 93).

At a higher inferential level, certain terms refer to explanations.
Explanatory terms cannot refer to observable entities. In Drosophila the 
character of having red eyes is said to be carried by a gene. We can observe 
the red eyes of the parents and the red eyes of the offspring, but we cannot 
observe the gene for eye colour because it is only a term that stands for an 
explanation of the relationship between parents and offspring. Well after the 
modern development of genetic theory, electron microscopes were per- 
fected to the point where scientists could speak of having seen the gene for 
such characters as the red eye of Drosophila. But, this is only a figure of 
speech. No matter how great the magnification, what they see is neither red 
nor shaped like an eye, it is only another correlate of the relationship 
between the red eyes of parents and offspring and it is entirely identified by 
that correlation. 

Like the gene, free will and intention only exist as explanatory constructs, 
they cannot be observed. Harré and Reynolds claim that they can sense their 
own intentions. That they cannot name the sense organ is unimportant. 
What is important is that whatever sensation they may have can only be a 
correlate of their behaviour, just as the object in the photomicrograph is only 
a correlate of the eye colour of parents and offspring. To the extent that 
intention only stands for a correlation between an internal stimulus and 
an external response, it explains nothing. On the other hand, to the extent 
that human beings are explaining their behaviour when they speak of 
intentions—as when we say that we intended to see that movie or write that 
article, or that we intend to see this movie or write this article—their 
explanations can be as false as any other explanations. To that extent, like 
genes or photons or phlogiston, intentions are inferences about nature. 

Inferences about one’s own behaviour are as fallible as inferences about 
any other sort of behaviour and they must be tested by the same rules of 
evidence. The ‘facial vision’ of the blind is a typical case. For hundreds, 
perhaps thousands, of years philosophers argued about whether blind 
human beings can locate objects at a distance. The dispute ended in 1944 
when Dallenbach and his associates placed a 4-ft wide by 7-ft high masonite 
screen at random places across the path of blind subjects. The path ran 
lengthwise through a 60 x 20-ft corridor. The blind subjects were asked to 
signal by raising one arm when they first detected the obstacle, and to stop 
walking and raise the other arm when they were about to collide with the 
obstacle. There were also catch trials in which there was no obstacle. Blind 
subjects were quite good at detecting and locating the obstacle. Next, 
normal-sighted student volunteers attempted the same task while blind- 
folded. They failed at first, but within about 20 trials they, also, could 
detect and locate the obstacle and their performance approached the 
accuracy of the blind subjects with a few hours of practice (Supa, Cotzin 
& Dallenbach [1944]). 

The experimenters asked both the blind and the sighted subjects to 
explain how they accomplished the task. About half of the blind and the 
sighted subjects were convinced that they felt the object on the skin of their
faces—their foreheads or their cheeks. This agreed with the consensus of 
blind people who had been interviewed over centuries of introspective 
inquiry into this phenomenon. In fact, the phenomenon had come to be 
known as ‘facial vision’ and that is the term used in the title of the 1944 
article. If these experimenters had been satisfied with ‘the unique entrée 
into the human mind’ posited by Harré and Reynolds, the investigation 
would have ended there as it did for so many of their predecessors. 

Fortunately, the experimental psychologists continued in their external 
investigation of the phenomenon and demonstrated in a series of precise and 
rigorous tests that the subjects located obstacles in their path by listening to 
the echoes of their own footsteps. They could do just as well when their 
heads were covered with hoods, so long as their ears were exposed. When 
their faces were bare, but their ears were stopped, they failed. Eventually, 
the experimenters devised an apparatus that emitted artificial tones and 
picked up echoes with a microphone. They suspended the apparatus from 
an overhead track so that it could be moved through the corridor by remote 
control. Under these conditions, subjects could control the movements 
of the apparatus and locate the obstacles from another room by listening 
with headphones to the sounds received by the microphone (Cotzin & 
Dallenbach [1950]). 

Both the blind and the sighted university students in this experiment had 
learned to locate obstacles in the dark by listening to echoes. They, and most 
of the blind people who down through the ages solved the same problem for 
themselves, could accomplish this difficult task without being aware of 
which sense-organ they were using. They could do it even when their mental 
imagery of facial vision was quite vivid. One of the blind subjects, for 
example, described faint shadows on his face that became clear and sharp as 
he approached an obstacle. That so many agreed in their reports of facial 
images and that they were so firmly convinced of it is a very interesting 
phenomenon. And yet, neither the numbers of those who agreed nor the 
strength of their convictions are evidence for the validity or even the 
relevance of their subjective reports as explanations of their behaviour. 


3 PROJECTIVE INTROSPECTION 


The introspective method has dominated Psychology throughout its 
history. The experimental method is a relatively recent development and,
to this day, has replaced introspection only in certain fields of Psychology. The
advantage of the experimental method is that it results in discoveries. The 
results of introspection, however, never transcend ‘common-sense or “folk”
psychology’ because that is all they really are. Yet, projective introspection
(‘to generalize outward into the minds of others’) is the method recom- 
mended by the editors of MOPS for deciding whether animals other than 
man have intentions. Within the covers of MOPS it is also the only method 
used by both philosophers and primatologists to compare the intentions of
human and nonhuman primates. It may well be that introspection is the only 
method that supports intentional explanations of behaviour. 

In their contributions, Harré and Asquith carry this projective technique 
a step farther. They give us their impressions of a selected sample of 
ethological writing in order to demonstrate that ethologists intend to 
attribute intention to nonhuman beings, particularly nonhuman primates. 
It follows that the ethologists must have the impression that nonhuman 
primates have intention, from which it follows that nonhuman primates 
must indeed have intention—at least to some degree. 

Asquith presents an elaborate linguistic argument to the effect that the use 
of ordinary language terms in ethological reports entails an intentional 
interpretation. According to Asquith, her arguments apply not only to terms 
for general categories such as ‘threat’, ‘appeasement’, and ‘aggression’, but 
also to more specific terms such as ‘greet’, ‘groom’, ‘look up’, and even ‘jump 
over a fence’. And, this is true no matter how much operational definition 
the ethologist provides. 


Behaviour categories are not simply a shorthand for various movements or 
vocalizations grouped in one manner or another; they mean¹ something more. That
is, they imply more than can be ascertained from the movement patterns and 
vocalizations alone. 

... behaviour category terms already have established semantic fields in ordinary 
human discourse in which we normally do wish to impute intentions and feelings to 
the performer. It is my purpose in the remainder of this chapter to suggest the 
process by which this anthropomorphic increment occurs. (p. 145) 


Asquith notes that most ethologists of the past two or three generations 
have explicitly and repeatedly denied that they mean to attribute human 
intentions to nonhuman animals by the terminology of their ethological 
descriptions, or by any other aspect of their ethological writing. She 
maintains, nevertheless, that all ethologists are hopelessly caught in a 
Whorfian web of predetermined meaning and usage. Moreover,


Besides (and perhaps because of) the fact that many of us mean to ascribe intentions 
to animals in everyday life, we also use ordinary language terms in scientific 
discourse about animals because they are appropriate for such description. Their 
suitability stems from the fact that living creatures are being talked about of whose 
behaviour we feel some intuitive understanding in everyday experience (based, 
perhaps, on a common biological heritage). (p. 153) 


Harré, in his chapter, invokes the same principle in his detailed analysis of 
the ordinary language terms used in a report of baboon behaviour published 
by Bachmann and Kummer. Because he finds both intentional and 
nonintentional terms in the Bachmann and Kummer article, Harré con- 
cludes that it represents a transitional movement in ethology from noninten- 
tional to intentional explanations. Presumably because of its 1980 date (no
other articles, current or past, are analysed), Harré concludes that all of
ethology is in the same state of transition from an earlier period of
‘misplaced positivism’ (p. 93) and that, ‘My argument so far has established
a strong tendency amongst ethologists to use intentional language in
describing the phenomena they observe’ (p. 99).

1 Throughout this article, the italics in quotations are the italics of the original authors.

Harré’s linguistic analysis of Bachmann and Kummer’s intention to 
attribute intention to nonhuman minds fails to establish anything general 
about ethologists because of its extremely narrow scope. On the other hand, 
Harré’s resort to such an indirect external route to discover the intentions of 
a fellow human being clearly contradicts the Harré and Reynolds editorial 
stance. If, indeed, each human being has a clear inward view of his or her 
own intentions, and introspective reports must be accepted as scientific 
evidence, then the way to find out what Bachmann and Kummer meant in 
their recent article was to ask them. 

Asking Kummer would have been easy, since Kummer, himself, attended 
the MOPS conference. As a matter of fact, Kummer’s introspective report 
of his meaning was voiced at the conference. In his printed comments 
(pp. 106-107), Kummer politely rejects Harré’s reading of Bachmann 
and Kummer and explicitly denies that Bachmann and Kummer meant 
to attribute intentions to their baboon subjects. He explains in detail how 
the terms were defined within their article by the methods of observation. 
That is to say, the terms were operationally defined as is the practice in 
ethology and other natural sciences—or at least, that was the intention of 
the authors. Within the covers of MOPS, Harré fails to acknowledge 
Kummer’s introspective reports of Kummer’s subjective experience in 
any way. 

Harré and Asquith offer their external linguistic analyses of ethological 
writings as evidence of the intentions of ethologists. Both are asserting that 
this external evidence is superior to the introspective reports of most of the 
ethologists in question. Anyone who rejects introspective reports in favour 
of external analysis, however, must also reject the thesis that each human 
being is the indisputable judge of his or her own intentions. But, within the 
covers of MOPS that editorial thesis is the only warrant we are offered for 
believing in the existence of human intentions in the first place. Meanwhile, 
if the ethologists routinely misread the intentions in their own written 
words, why should we trust them when they attribute intentions to the 
signals of other primates—particularly when these attributions are 
unintended? 

What Harré and Asquith are saying is that ethologists must mean what 
Harré and Asquith say they mean because of linguistic conventions that no 
practicing scientist can break. Oddly, the linguistic determinism that Harré 
and Asquith apply to ethological literature denies freedom and inten- 
tionality in the human use of language precisely where operationism would 
grant so much freedom. They play Whorfian Alice to operationism’s 
Humpty Dumpty. 

‘I don’t know what you mean by “glory”,’ Alice said.

Humpty Dumpty smiled contemptuously. ‘Of course you don’t—till I tell you. I 
meant “there’s a nice knock-down argument for you!”’

‘But “glory” doesn’t mean “a nice knock-down argument”,’ Alice objected.

‘When I use a word,’ Humpty Dumpty said, in a rather scornful tone, ‘it means 
just what I choose it to mean—neither more nor less.’ 

‘The question is,’ said Alice, ‘whether you can make words mean so many different
things.’ 

‘The question is,’ said Humpty Dumpty, ‘which is to be master—that’s all.’


4 EXPERIMENTAL ETHOLOGY 


Operational definition may not be a perfect tool, but it is a sound one and 
empirical discoveries depend on it. Colour vision, for example, is plainly a 
phenomenon of subjective experience; Newton, himself, was quick to 
acknowledge that, ‘the Rays to speak properly are not coloured’. The colour 
is somewhere in the beholder. Although some 2 in 25 European men have 
deficient colour vision, the phenomenon was unknown to the scientific 
world until John Dalton’s report of his own anomalous experience. 
Moreover, that talented scientist was in his middle twenties before he 
realized that his visual world was severely abnormal. In describing his case, 
Dalton was forced to use the colour words that he had learned from the 
normal-sighted community. The result remained confusing and misleading 
on many points until the arrival of modern colorimetry and external 
definitions of colour vision. 

How about other animals, do they also have colour vision? According to 
Harré and Reynolds, 


Ethology has added a vast richness to this [psychology’s] experimental paradigm. 
We do not need to isolate an animal or to control the input side; we can, by patient 
observation in natural conditions, discover that certain environmental features are 
attended to by an animal, and that certain predictable responses follow. In general, 
the relation between the stimulus and the response is adaptive: it is conducive either 
to the survival or to the reproductive success of the animal. So we can assume that in 
the head of the animal there are neural programmes that assess environmental 
features and selectively activate responses appropriate to them. (p. 5) 


This quaint description of ethologists as patient observers of nature is a 
picture of the naturalist that may have been offered to school children two or 
three generations ago. We hope that most modern readers are better 
informed. Tinbergen and Von Frisch and their students would be the first to 
be ruled out of ethology by the MOPS definition. These discoverers are 
known for their intense commitment to experimentation under natural 
conditions and for the elegance and rigour of their experiments. This is what 
set them apart from earlier field naturalists and why they are recognized as 
the founders of a new field. 

Early naturalists did, indeed, examine the habits and neurology of 
different species to see whether these were similar enough to the human case 
to suppose that this or that nonhuman being might see the world in colour,
as humans see it. The habits of honey bees, for example, seem to indicate 
that they distinguish flowers on the basis of colour, but the anatomy and 
neurology of their eyes and brains are far from human. It was argued, on the 
one hand, that honey bees only seem to be distinguishing the colour of the 
flowers and that simpler explanations could account for their behaviour. 
Some argued that the apparent colour vision of the honey bee is only another 
case of anthropomorphism in the long history of animal behaviour. Others 
argued for the simplicity of colour vision itself, that it might not require 
organs as complex as the human eye and brain. There the matter stood, 
perennial fuel for debate, until the rigorous and precise experiments of Von 
Frisch early in this century. 

Von Frisch showed that bees not only discriminate colours, but that they 
sort them into groups of primary colours the way human beings do, 
although they may group them more crudely than humans. He was also able 
to show that the spectrum of colours visible to the honey bee is shifted away 
from the red and into the ultra violet; they confuse red with black and see 
ultra violet as a distinct colour. Plainly, the early natural philosophers could 
never have achieved these discoveries by the projective introspection 
prescribed by Harré and Reynolds. Von Frisch’s work depended upon 
an externally verifiable way of observing colour vision—on operational 
definition. 

Totally colour blind human beings can still discriminate differences in 
wavelength. This is because the differential sensitivity of the retina makes 
lights of different wavelengths appear brighter and dimmer even when they 
are of equal intensity. Thus, a cat that has learned to push a red panel rather 
than a green panel to get food has not yet demonstrated that it has colour 
vision because human beings can do as well without colour vision. We must 
show that the cat, like human beings with normal colour vision, can continue 
to choose consistently in the face of extreme variations in the relative 
intensity of the two panels. Nonhuman subjects often fail the test at this 
point, but that does not mean that they lack colour vision. It only means that 
up to that point they were using relative brightness, and it worked. The 
experimenter must persist in the testing to see if the subjects can recover 
from this confusion. There are notable examples of nonhuman animals 
declared to be colour blind by experimenters who gave up too soon. Cats, for 
example, were declared to be colour blind by several different sets of 
experimenters. It was only in quite recent times that more ingenious 
experimenters, who saw that the standard rat reinforcement procedures 
were inappropriate for cats, were able to demonstrate that cats do indeed 
have colour vision. Von Frisch’s success with the bees also depended on his 
informed and thoughtful adaptation of testing procedures to the animal 
under investigation. 

Seeing the world in colours is more than the ability to discriminate 
between wavelengths; it is the ability to identify and group objects according 
to colour. In the experiments that have satisfied this requirement, animals
showed that they could remember the colours of things. Honey bees can
remember more than the colour of flowers; they can also remember the
shape of flowers (Gould [1985]). Thus, operational definitions derived from 
careful experiments permit us to speak with some confidence about the 
pictorial representation of flowers in a brain as alien to ours as the brain of a 
honey bee. 

Further experiments show us that when they return to the hive, these 
simple creatures with their tiny brains routinely communicate to other bees 
about distant sources of food. Scouts do not have to pinpoint targets for 
robotic followers, but rather indicate a promising area of forage. Followers 
are free to take advantage of other targets that may present themselves, as 
when a change of wind wafts across the path the scent of richer or closer 
forage. If there is an obstacle, such as a hill, the followers can reckon the 
detour into their flight path. 

When the bees swarm, they form a mass which clings to a convenient 
branch while scouts search out a favourable site for the new hive. Returning 
scouts dance on the surface of the mass to communicate their findings. 
Sometimes their reports disagree. ‘Prime location, 100 metres due South’, 
dances one scout. ‘Spectacular site 75 metres North by Northwest’, dances
another. Each rival recruits followers who reconnoitre the proposed sites for 
themselves and return to join the debate. The swarm remains until there is 
consensus, with disastrous results should the debate go on too long. 


5 EXPERIMENTAL PRIMATOLOGY


Rigorous experimentation under natural conditions is, of course, much 
easier to accomplish with stickleback fish or honey bees than with more 
exotic subjects such as vervet monkeys. For a very long time, nearly all 
scientific information about nonhuman primates was based on short-term, 
naturalistic observations in the wild and on intensive studies of caged 
subjects in the laboratory. Rigorous laboratory experiments led to extensive 
discoveries of the intelligence of monkeys and apes, particularly the line of 
experimentation on exploratory behaviour and learning sets introduced by 
Harry Harlow and his associates in the 1940’s. Extensive as this literature 
has been, and as relevant as it is to so many of the discussions in MOPS, one 
gets the impression that it is virtually unknown to the contributors. The 
suggestions for future research that appear in the editorial epilogue assume 
that no such literature exists. 

During the early period there were also many reports of naturalistic 
observations of captive groups confined under impoverished and artificial 
caged conditions. There is a great contrast between this line of research, 
which included neither experimentation nor anything approaching natural 
conditions, and the mainstream of ethological research. It was clear to many 
ethologists and comparative psychologists that the conditions of confine- 
ment distorted the picture of behaviour that was emerging from these
observations of captive monkeys. Particularly suspect was the frequency 
and intensity of agonistic behaviour and the severity of aggression. These 
suspicions were confirmed by the later, intensive, long-term studies in the 
wild.

Doubts about the validity of studies of caged groups of monkeys were 
widely expressed from the start. Moreover, the modern field work that has 
been appearing since the early 60’s is by now extensive and widely-known 
(cf. any general work on animal behaviour, e.g. Barnett [1981], pp. 353-381). 
Nevertheless, the early approach is heavily represented in MOPS. For his 
contribution, Reynolds culled from his 1962 Ph.D. dissertation on a captive 
colony of rhesus monkeys a fragment concerned with observations of 
pathological aggression obviously induced by caged conditions. The level of 
aggression was severe enough to lead to the death of one monkey and the 
physical wasting away of another. Reynolds presents his outdated and 
fragmentary observations as evidence that, 


... we are left with a process at the heart of rhesus social life that is rather automatic. 
It is an evolutionary product, an innate neural response. ... If Henry had been 
‘thinking’ before his attacks on Anne, it is likely either that he would have ‘decided’ 
not to attack at all, or that he would have moderated his bites. (p. 219) 


Modern investigators spend many months, often years, in continuous 
observation of naturally occurring groups of monkeys and apes living in 
their natural habitats. While the functional significance of calls and gestures 
can be deduced from adventitious observations of correlations between 
signals, responses, and external events, modern standards of analysis 
require more conclusive evidence. There must be some experimental 
manipulation. A good procedure is to record different types of vocal signals 
and play them back experimentally in neutral situations. 

Thus, among the calls of the vervet monkey observed in its natural habitat
in Africa, are three different alarm calls that have been correlated with 
the appearance of three different predators, leopards, martial eagles, and 
pythons. Three different sorts of defensive response are correlated with the 
different types of predator. The technical problems involved in playing back 
suitably faithful recordings of these calls under field conditions and at the 
same time obtaining suitably faithful records of response are enormous. 
Seyfarth and Cheney must be praised as pioneers who overcame so many 
technical problems to bring back experimental evidence of the referential 
value of different vervet calls. 


6 COGNITIVE ETHOLOGY 


Seyfarth and Cheney present a somewhat truncated description of their 
research programme in MOPS. Rather more complete and informative 
descriptions appear elsewhere in the professional literature. They have used 
the space saved to air in MOPS some rather specious and overblown
speculation about the internal mental life of the vervet monkey. 


We may further hypothesize that the monkey gives an alarm call because it wants 
others to believe that there is either (a) something interesting nearby, (b) a predator 
nearby, or (c) a specific kind of predator nearby. All of these would be second order 
intentionality. To date, field observations suggest that explanations (a) and (b) can be 
eliminated, leaving explanation (c) as that most strongly supported by existing data. 
Throughout such an exercise, the main purpose of Dennett’s scheme is to answer the 
question ‘How complex must our human terms be to describe a particular pattern of 
behavior? When we compare different sorts of communication, which require the 
most complex terms?’ (Seyfarth, p. 41) 


Dennett’s proposal, as presented by Seyfarth, represents an ancient 
alternative to parsimony. A modern biologist must ask what is added by 
introducing the term ‘believe’ into this account of vervet behaviour. The 
observation that vervet monkeys tend to take to the trees or stay in the trees 
when they hear call X and leave the trees or stay in ground cover when they 
hear call Y is hardly grounds for anyone, even the caller vervet, to infer that 
the hearer believes anything. To an outsider it is certainly thinkable that a 
vervet monkey could believe that there is an avian predator nearby and stay 
in the trees anyway, or contrariwise, stay on the ground while believing that 
a leopard is nearby. It is only if we, and vervet alarmists, conclude that any 
vervet who believes that there is a leopard nearby will take to the trees or stay 
there, that the term is warranted at all here. But, in that case, we can also say 
that the follower bees believe that food is in the location signalled by the 
scouts, that spiders build webs because they believe that the webs will catch 
flies, and that laboratory rats (when offered a variety of foods) eat a balanced 
diet because they believe it is better for their health. 

Suppose we retreat to the position that the alarmist vervet emits call X 
because it wants others to be in the trees—thus avoiding the need for any 
human, or vervet, assumption that vervets believe anything? The trouble 
now is that, to a nonvervet primate, it is certainly thinkable that a vervet 
could emit alarm call X without caring in the slightest what the others do,
or even while inwardly hoping that the leopard will eliminate some hated
conspecifics. The alternative is to conclude that vervet monkeys must want 
the observed results of their calls. Why else would they call? But then we 
must also conclude that the scout bees (who also signal only when there is an 
audience) want the followers to find the food, that spiders want to catch flies, 
that rats want a balanced diet, and so forth. 

It is now time to ask what is gained by introducing the term ‘want’ in the 
first place. What is gained by saying that the alarmist vervet emits call X 
when it wants to emit call X, and call Y when it wants to emit call Y, and so 
on, if all we are saying is that when a vervet emits a call the reason must be 
that it wants to emit that particular call? By a similar line of reasoning the
action of pumps was once explained by nature’s hatred for vacuums. It could 
also be said that Boyle’s Law works because nature wants it to work (or hates 
for it not to work). Modern scientists have come to understand that
explanations of this type are always empty. Explanatory terms such as ‘want’ 
and ‘believe’ fail, not because they are anthropomorphic, but because they 
are post hoc. Because they can only furnish explanations after the fact, they 
can never lead to discovery. They are equally post hoc when applied to 
human behaviour. 

Seyfarth uses the term ‘mental image’ in a similar fashion. 


When one person uses a word in speaking to another, three important events take 
place. First, the speaker creates a sound with his vocal organs. For all those who share 
the speaker’s language, this sound represents or ‘stands for’, a relatively specific set 
of information. Second, the speaker has some mental image of an object or concept as 
well as the intent to communicate this image to another. Third, the listener 
assimilates the information conveyed by the sound, and responds to it. (p. 43) 


Cheney and Seyfarth claim that their evidence for vocal communication 
among wild vervets entails the existence of vervet imagery. The pitfalls of 
this line of reasoning were brought home to us by a cognitive psychologist 
who visited us recently in our laboratory. Our visitor asked us if Washoe had 
mental images of our mental images of what she signed to us. The hospitality 
of the desert being what it is, we only responded with some suitably polite 
vagaries. Our visitor took our response for an affirmative answer to his 
question and then pointed out to us that that meant that we had an image of 
Washoe’s image of our image. He further pointed out that his ability to say
what he had just said meant that he had an image of our image of Washoe’s
image of our image, and our ability to understand what he was saying meant 
that we had an image of his image of our image of Washoe’s image of our 
image. Moreover, our visitor had a mathematical theory of images in which 
each image of an image represented a new level of imagery. The theory could 
only cope with six levels of imagery, at least at the time of that visit. 

Approaches that confuse observation with inference and defy the rule of
parsimony lead to infinite regress rather than to discovery. A perennial
argument for the role of imagery in human thought that is favoured in 
MOPS (cf. p. 12) concerns the reports of vivid imagery by famous
scientists, such as Einstein. What do such reports, usually long after the fact 
and often second or third hand, tell us about the function of imagery? Is it 
that all human beings have these images, but only the Einsteins of the world 
have the genius to grasp their significance? In that case, the problem of 
scientific genius remains unchanged. At best, the image becomes a readout 
of perception and memory, tangentially and trivially related to scientific 
genius. Is it, instead, that scientific geniuses have richer images than those
granted to the ordinary run of humanity? If so, where do these superior 
images come from? Is it the gods, or the Einsteins, themselves, who summon 
up the images of genius? If it is genius, itself, that does the conjuring, why 
then it is in preimagery, in the part of the brain that creates images, that we
must look for the roots of genius. Introspective reports of imagery become 
distant, distorted echoes of some tangential process. 

Many of the contributions to MOPS depend upon variants of Griffin’s
model in which intention must be based on imagery. In this view, the 
individual (whether human or nonhuman) must compare images of the 
results of various courses of action in order to choose among them. After 
studying images, $I_1, I_2, \ldots, I_N$, the individual chooses and then embarks on
course, $C_1, C_2, \ldots$ or $C_N$, accordingly. The trouble here is the same as
before. If all individuals have roughly the same images, then images are only 
a form of readout, and choice depends upon the weighting of alternatives in a 
later stage of processing. In that case, we must explain how this part of the 
brain chooses the weights without first forming images of the results of 
different sets of weights, and so on. If, instead, the individual can tamper 
with the images, themselves, say by adding a little more attractive colour to 
$I_1$, or by refusing to summon up $I_1$ and $I_2$, why then the critical function is
carried out at an earlier stage. Only now we have the problem of explaining 
how this part of the brain can choose to colour or censor images without first 
forming preimages of the various sets of tampered and untampered images, 
and so on. 


7 INTROSPECTION AND FOLKLORE 


How do we distinguish introspective accounts of behaviour from folklore? 
In some societies there are individuals, called shamans, whose behaviour sets 
them apart. They may, for example, assume unusual postures, froth at the 
mouth, and talk in strange voices. Afterwards, the shaman typically reports 
that he was possessed by a spirit, names the spirit, and recognizes the 
difference between episodes of possession and ordinary experience. The 
possession gives the shaman powers to influence the health and well-being of 
others in the community who thus share in the possession. Modern 
anthropologists and psychologists cannot deal with this phenomenon if they 
are forced to accept introspective accounts as literally true. They must be 
free to doubt the shaman’s claim that he was indeed possessed by spirits. 
They must even be free to doubt the claim that there is any such thing as a 
spirit in the first place. The same argument applies to the traditional 
mythologies of wants, beliefs, images, and so forth, that seem so firmly 
established in Western folklore. If there is any lesson to be learned from the 
history of science it is that discovery depends upon the ability of scientists to 
raise doubts about the traditional folklore and to demand external evidence. 

Structural linguists, such as Chomsky, insist that human beings understand
and parse novel sentences because human beings claim to have these
abilities. External evidence that human beings are inconsistent in their 
responses to novel sentences is rejected on grounds that such evidence 
represents performance rather than competence. Nonlinguistic factors such 
as perception and memory are said to intervene between competence and its 
outward expression in performance. Thus, we are asked to accept the claim 
that human beings can understand and parse for the simple reason that 
introspective reports contain this claim. All evidence is ruled out, even a
vote to determine how many human beings subscribe to the claim. To 
practicing scientists, such a position is intolerable. All hope of discovering 
anything new about meaning and language must be abandoned in a 
discipline that cannot question the prevailing folklore. All claims must be 
scrutinized, including claims that understanding and grammar exist apart
from the folklore. Imagine a physics or a chemistry, or even an economics or 
a history, in which folklore had to be accepted in favour of empirical 
evidence. 

In this connection, an anthropological study of structural linguistics 
would have been helpful. Consider de Civrieux’s recent account of the
Makiritare who live deep in the forests of a mountainous region of 
Venezuela. They were called Makiritare by the Arawak-speaking guides 
who were with the Spanish when, in 1759, the first recorded contact was 
made with this remote people. The Makiritare call themselves, So’to. In 
their language ‘So’to’ means simultaneously the language and those who 
speak the language. Those who do not speak it are regarded as nonhuman— 
enemies who can be hunted as animals. But individuals from other tribes, 
usually women and children, who find themselves among the So’to can 
become So’to as soon as they learn the language. A similar story can be told 
about many human groups. And, if the story seems exotic to you, consider 
how little time has elapsed since the days when ‘English’ or ‘Français’ could
have been substituted for ‘So’to’. The identification of language with 
humanity has deep tribal roots. 


8 COMMUNICATION AND INTELLIGENCE 


The notion of unbridgeable gaps between verbal behaviour and the rest of 
human behaviour and between human intelligence and the rest of animal 
intelligence becomes ever more difficult to defend. Advances in the study of 
human communication are rapidly filling in the traditional gaps. The more 
we learn the more difficult it becomes to specify a gap between prelinguistic 
communication and language proper, between holophrase and sentence, 
between gesture and word, between sign languages and spoken languages, or 
even between action and language. Meanwhile, sign language studies of 
cross-fostered chimpanzees have revealed further dimensions of continuity. 
Some discussion of recent discoveries in human communication would 
seem to be essential for most of the topics covered in MOPS, but the subject 
is virtually ignored by both editors and contributors. Sign language studies 
of chimpanzees are brought up throughout the book, perhaps more 
frequently than any other topic. Neither editors nor contributors, however, 
seem to be familiar with scientific (as opposed to journalistic) literature in 
this rapidly developing field. Lieber was chosen to comment on this topic 
from the point of view of structural linguistics, but his comments are made 
without reference to any of the scientific developments of the last 15 years of 
a field that was less than 20 years old when he was writing. He writes, for
example, as if totally unaware that cross-fostered chimpanzees use the 
inflections of American Sign Language (ASL) very much the way deaf 
human children do. Nevertheless, published evidence of this has been in the 
literature since 1978. 

Project Washoe was replicated and extended with the chimpanzees Moja, 
Pili, Tatu, and Dar. The existence of the replications together with all of the 
more advanced developments in Project Washoe are totally ignored in 
MOPS. Terrace’s contribution to MOPS covers only those aspects of
Project Washoe that Terrace replicated with the chimpanzee Nim. Since 
Terrace failed to replicate most of Project Washoe either in concept, 
execution, or findings, his review of the field is severely truncated. 
Unmentioned are: the experimental evidence for communication of natural 
language concepts, the development of ASL inflections, the abundant 
records of conversation between chimpanzees including records made with 
remote video equipment and without any human presence, and the infant 
chimpanzee Loulis learning signs from his adopted mother Washoe, to cite 
just a few major findings.1

ASL is highly inflected and makes relatively little use of sign order, but
this was the only grammatical device that Terrace studied because, ‘sign 
order is one of the easiest, if not the easiest, grammatical devices [sic] of sign 
language to record’ (p. 181). Several thousand of Nim’s utterances were 
recorded by assistants over a period of about two years. Stripped of all 
context, verbal or nonverbal, without notes as to who was addressed, what 
was present, or when in Nim’s development they were uttered, all items 
were fed into a computer and examined for statistical regularities. The 
statistics that emerged from this exercise proved to Terrace that the 
sequences were not random and could not have been memorized, but 
yielded no evidence that the sequences were used grammatically. 

How could such an analysis demonstrate any meaningful use of sign 
language? The only meaningful way to use sign order or word order would 
be to vary order with context. Two utterances such as Nim tickle and Tickle 
Nim might show meaningful use of order if they occurred in different 
contexts. In Terrace’s analysis they could only cancel each other, and if one 
occurred significantly more often than the other, the most that we would 
know would be that the distribution was not random. 

Operational definition grants something less than complete freedom, after 
all. To be useful, a report that a particular chamber reached a temperature of 
25 degrees Centigrade must include an operational definition of the 
procedure used to arrive at that number. The interpretation of the 
observation depends entirely upon the relation of that procedure to 


1 For comprehensive reviews of discoveries in this field during the past 15 years see Fouts, 
Hirsch and Fouts [1982]; Gardner and Gardner [1978, 1985]; and Van Cantfort and Rimpau 
[1982]. 


Discovering and Understanding the Meaning of Primate Signals 493 


generally used and generally useful ways of measuring temperature. To the 
extent that the relation is remote or nonexistent, the results of the 
experiment are irrelevant to other measurements of temperature. Terrace’s 
statistical analysis of Nim’s sign order is negligibly related to any other 
empirical or theoretical work on the use of sign order or word order as a 
grammatical or even a communicative device in humans or chimpanzees. 
Moreover, it is plain that, if the same procedure were applied to the use of 
sign order or word order by human children, or by human adults for that 
matter, the result would be equally negative. 

To the extent that Terrace does give us operational definitions, rather 
than subjective impressions, his procedures and results can be compared 
with other research and reinterpreted. Such reanalysis, including further 
discussion of operationally useful ways of comparing the signed and spoken 
utterances of chimpanzees and children, can be found in Van Cantfort and 
Rimpau [1982]. 

One of the chief inflectional devices of ASL is the combination of
elements of two signs into one sign. Because sign languages are visual and 
spatial, the elements are combined simultaneously rather than sequentially 
as they are in spoken languages. Both human children and cross-fostered 
chimpanzees adopt this mode of inflection when they are quite young. Yet, 
in his analyses of Nim’s signing, Terrace systematically eliminated all 
simultaneous combinations. The reason given is that, when he could not 
tell which sign came first, he could not tell whether a combination was 
grammatical. Again, in his readings of selected portions of the Washoe film, 
Terrace attributes his own inability to distinguish one sign from another and 
his own ignorance of the pragmatics of ASL conversation to imaginary 
errors of Washoe’s. Terrace’s comments on ASL typically illustrate how 
little he, himself, knew about the language he was supposed to be teaching to 
Nim (cf. Van Cantfort & Rimpau [1982], pp. 40-47). 

The few thoughtful and thought-provoking remarks that we could find in 
MOPS were all contributed by Harris, as in the following comment on 
Terrace’s preoccupation with word order as the test of grammar. 


... such assessments are based on rather nebulous views about ‘the essence of human 
language’, appealing in particular to the slippery and far from perspicuous concept of 
a ‘rule of grammar’. Anyone who thinks that linguists are in fundamental agreement 
about what a rule of grammar is ought to acquaint himself at first hand with the 
diversity of opinion which is evident on that subject in linguistics in recent years. ... 
Even within particular schools of linguistics, quite conflicting views of what a rule of 
grammar is are to be found. So claims that apes have not shown any ability to operate 
with ‘rules of grammar’ are vacuous in the absence of any clear explication of what 
rules of grammar are. (pp. 204-205) 


Perhaps the essence of language has been so elusive to so many for so long 
because it is, in fact, a will-of-the-wisp. There may be no substance after all 
to the traditional belief in an unbridgeable chasm between Language and the 
rest of the natural world. The unbridgeable chasm between organic and 
inorganic also depended on a vital but intangible essence. The history of
science has not dealt kindly with the great discontinuities. Eventually 
recognized as conceptual barriers rather than natural phenomena, most have 
been passed by and abandoned rather than broken through in the course of 
scientific progress. We would predict a similar future for the traditional 
discontinuity between syntax and semantics. 

To a Darwinist, if human verbal behaviour requires any significant 
expenditure of biological resources, then it must confer selective advantages 
on its possessors. In order to confer any selective advantage, however, a 
biological trait must operate on the world in some way; it must be 
instrumental in obtaining benefit or in avoiding harm. If clarifying one’s 
ideas confers selective advantage it must be because in some way clarified 
ideas provide superior means for operating in the biological world. As for 
establishing social relations, a system of displays and cries is sufficient to 
maintain group cohesiveness in most animals. The selective advantage of 
one system of communication over another would seem to be the communi- 
cation of more information. But, unless verbal behaviour refers to objects 
and events in the external world, it cannot communicate information and it 
cannot have any selective advantage. Thus, reference is the Darwinian 
function of verbal behaviour, and the function of grammar or structure in 
verbal behaviour must be to enlarge the scope and increase the precision of 
reference. 


9 CONCLUDING COMMENT 


Darwin’s triumph was a lawful system that could account for the marvellous
variety of living forms without invoking the intervention of arbitrary 
supernatural forces. It is an approach to biology that leads to experimental 
questions. Experiment, in turn, leads to discovery, and Darwinism prevails 
throughout biology because it has been a springboard for so many 
discoveries. Nevertheless, where determinism of blood and bone was 
rapidly accepted, determinism of thought and feeling is resisted to this day. 
We see in MOPS and other works of its kind that A. R. Wallace’s alternative 
is still well-entrenched. 


The Darwinian theory, even when carried out to its extreme logical conclusion, not 
only does not oppose, but lends a decided support to, a belief in the spiritual nature of 
man. It shows us how man’s body may have been developed from that of a lower 
form under the law of natural selection; but it also teaches us that we possess 
intellectual and moral faculties which could not have been so developed, but must 
have had another origin; and for this origin we can only find an adequate cause in the 
unseen universe of Spirit. (Wallace, 1889, p. 478) 


Consistent with the emphasis on spirit over substance, we see in MOPS a
striking lack of interest in what animals do. Repeatedly, large bodies of 
pertinent research literature are overlooked as if never consulted. This is 
particularly noticeable in the frequent editorial calls for starting research 
into already well-studied fields. We can only recommend MOPS to those
who have an equal lack of curiosity about recent discoveries in animal 
communication and animal intelligence, and to those who long for a return 
to a prescientific psychology in which nature could be understood by gazing
within and contemplating the soul. 


R. ALLEN GARDNER and BEATRIX T. GARDNER 
University of Nevada, Reno 

Reviews


HENDRY, JOHN [1984]: The Creation of Quantum Mechanics and the 
Bohr—Pauli Dialogue. D. Reidel Publishing Company. xi+ 177 pp. 
(ISBN 90-277-1648-X).


‘So far as I am aware,’ writes the author in the preface to this interesting and 
provocative book, ‘. . . this is the first [history of quantum mechanics] to 
incorporate the results of the large amount of detailed scholarly research 
completed by professional historians of physics over the last fifteen years. It 
is also, I believe, the first since Max Jammer’s promising study of fifteen 
years ago to attempt a genuine ‘history’ as opposed to a mere technical report
or popular or semi-popular account.’ 

In so far as it purports to be a scholarly history of quantum mechanics up 
to 1927, however, the book has some very curious features, as the reader will 
soon discover in the Introduction (Chapter 1). The treatment is selective 
and indeed quite intentionally lopsided; the author has relatively little to say 
in the book about the development of wave mechanics. It is in fact nothing 
like what one expects a comprehensive historical account of quantum theory 
up to 1927 to be. Imagine such a book with an opening chapter devoted 
chiefly to Pauli’s critique of Weyl’s 1918 unified field theory. Planck’s 
seminal 1900 work on the black-body radiation, Einstein’s 1905 light 
quantum hypothesis and Bohr’s 1913 treatment of the hydrogen atom, for 
example, naturally appear in the book, and on occasions their significance is 
discussed at some length. But the details of and the connections between 
these fundamental landmarks in the history of the old quantum theory are 
assumed to be familiar to the reader. So despite the fact that the author 
attempts, within the limits of decency, to conduct the historical discussion at 
a non-technical level, it would hardly serve as an ideal introduction for, inter
alia, ‘physicists and physics students’ who have undertaken little prior 
reading in the subject. 

The book is really a sophisticated historical case study in the epistemology 
and methodology of modern physics. The historical context is the develop- 
ment of the matrix mechanics through to the statistical transformation 
theory, and its physical interpretation. One of the things Hendry is 
attempting to do, as he states in the Introduction, is ‘to approach the history 
of the theory of quantum mechanics as a means to exploring its philosophy’ 
(p. 4). By considering in detail broader epistemological debates between the 
theoreticians involved in the birth of matrix mechanics, and concerning the 
nature of the fundamental concepts employed in physical theories, ‘it should 
be possible to place the interpretative problem in a somewhat clearer light 
than hitherto...’ (p. 4).


498 The British Journal for the Philosophy of Science 


The study is divided into two parts. The first (Chapters 2—5) covers the 
historical period between the 1913 Bohr atom and the emergence of 
Heisenberg’s ‘new kinematics’, prior to its mathematical refinements in the 
hands of Born and Jordan. The conceptual issues most hotly debated in this 
period were wave-particle duality, energy conservation, and to a lesser 
extent, causality. Theoreticians’ attitudes towards these issues reflected
their prior, and often evolving, epistemological and methodological persua-
sions. According to the author, of considerable relevance here was the
concomitant debate concerning the status of unified general relativistic field 
theories, and in particular the possibility of reducing discrete, elementary 
charged particles to a pure field (continuum) formalism. The link between 
the two debates was largely provided by Pauli, and the author attempts to 
show how Pauli’s fundamental views in the philosophy of physics grew out 
of his critique of unified field theory. In the Introduction, Hendry describes 
the ‘evolving theme’ of this first part of the book as ‘one of a debate, its 
origins in the quest for a unified general relativistic field theory, between the 
established master, Niels Bohr, and the up-and-coming Wolfgang Pauli’ 
(pp. 4-5). Yet although much is said here regarding the development of 
Bohr’s philosophy of the microworld up to and immediately after his 
notorious 1924 virtual oscillator theory with Kramers and Slater, the 
relevance of the Bohr—Pauli dialogue is not really evident until the end of the 
second part of the book. It seems to me that of equal, if not greater, 
importance in this first section is the treatment of the Heisenberg—Pauli
dialogue. What emerges therein is a Heisenberg who, in the ever-increasing 
measure that he yields to Pauli’s methodological maxims, gets closer to 
solving the quantum puzzle. This first part of the book culminates in the 
appearance of Heisenberg’s epochal 1925 paper on the new kinematics, in 
the preparation of which ‘he had adopted Pauli’s phenomenological 
approach and . . . his operational ideas as well’ (p. 66).

The second part of the book (Chapters 6—10) covers the growth of the 
matrix mechanics, the statistical transformation theory and the develop- 
ment of the general, Hilbert space formalism of quantum mechanics. (Some 
discussion is given to de Broglie’s thesis, the emergence of Schrödinger’s
wave mechanics, and their reception by those working in the discrete, 
matrix formalism, but it is always the latter that dominates the story.) These 
formal developments are tied in with the parallel programme of providing a 
consistent ‘foundational framework’ to the theory, that is to say, the attempt
to establish a coherent physical interpretation of the formalism. Hendry 
wishes to show that all these developments, including the emergence of 
Heisenberg’s uncertainty principle in 1927, ‘took place within the frame- 
work of Pauli’s ideas’ (p. 5). Finally, there is a discussion of how Bohr’s 
principle of complementarity then emerged as a compromise between the 
earlier conflicting foundational views espoused by, primarily, Heisenberg 
and Pauli on one side, and Bohr (and to some extent the defenders of the 
wave formalism) on the other. 

Taken as a whole, the study is an important contribution to the literature
on both the history and philosophy of quantum mechanics. Although I am 
doubtful whether the author succeeds in all that he sets out to do, as I will 
indicate later, I think the book has considerable value as an up-to-date and 
thought-provoking reassessment of certain chapters in the history of 
quantum theory and its philosophy. As evidenced by his excellent, earlier 
publications in the field, Hendry brings to the study a profound sensitivity 
and understanding bearing on a wide range of historical and foundational 
issues, especially in relation to the old quantum theory. The treatment is, in 
general, both erudite and lucid (and at times quite eloquent), and the author 
is for the most part extremely sure-footed in dealing with both the detailed 
physics and the niceties of the philosophical issues involved. 

Two minor episodes when that sure-footedness is less apparent are the 
following. In Chapter 2 (‘Pauli and a Unified Field Theory’), Hendry cites a 
passage from Pauli’s famous 1921 survey of relativity theory, in which
continuum theories are criticised for incorporating a non-operational, and 
hence (in Pauli’s view) meaningless notion of the electric field strength in the 
interior of the electron (or proton). The idea is straightforward: the field 
strength is defined operationally in terms of a reaction suffered by a test 
particle, and there are no test particles smaller than an electron. Hendry’s 
own construal of Pauli’s argument is this (p. 14): 


According to Pauli, any attempt at a complete unified theory would have to account 
for the internal structure of particles, and would therefore have to incorporate 
complex but well-defined properties of matter independent of and prior to those of 
the electromagnetic-gravitational field, which were therefore operationally meaning- 
ful only on a scale large compared with that of the elementary particles (my italics). 


I find this rendering of the argument puzzling; in particular, the second 
‘therefore’ is quite misleading. I mention this point, not because anything of 
fundamental importance hangs on the wording in this passage, but because 
Pauli’s argument plays such an important role in the book (as we shall see 
below), and hence deserves a better gloss in it. 

The second episode concerns the emergence in 1926 of Born’s proba- 
bilistic interpretation of the wave formalism, and particularly Born’s views 
on the possibility of ‘hidden phases’ (hidden variables in today’s vernacular) 
which would provide a complete description of individual scattering events. 
In Chapter 7 (p. 92), Hendry refers to ‘Born’s insistence upon acausality, 
resting as it did on the non-existence and non-observability of further 
microscopic coordinates [i.e. hidden phases]. . .’ And yet on pages 89 and 90
ample evidence is given of Born’s view that the possibility of the existence of 
such hidden phases could not be ruled out; Born’s point appears to be simply
that they could not be observed, and so had no practical value in the theory. 
(I might add here that given the emphasis in the book on Pauli’s role in the 
development of quantum mechanics, it is disappointing that Hendry does 
not comment on Wessels’ 1980 study which concludes that it is really Pauli 
who should be attributed with the discovery of what is normally understood
as Born’s interpretation of the ψ-function.)

I might also mention here two very minor historical errors. On p. 85, 
Hendry refers to an ‘ambiguity’ in de Broglie’s early work on matter waves, 
which in a footnote is connected with the point that de Broglie ‘emphasises 
the priority of the wave picture, although his whole theory had been based 
upon the primacy of the particle concept for light’ (footnote 6, p. 157). In 
fact, the strict corpuscularity of light was no more a feature of de Broglie’s 
early scheme than that of electrons was (see Brown and Martins [1984], 
Section II). Later, in another footnote describing the views expressed 
individually by the members of the camp opposed to the new Copenhagen 
orthodoxy during the 1927 Solvay conference (composed of de Broglie, 
Schrödinger, Einstein and Lorentz), we read: ‘Einstein was already talking 
in terms of a statistical or ensemble interpretation as being all that the theory 
could unambiguously support (from a completely opposite point of view 
that came very close to Dirac’s position)’ (footnote 74, p. 164). But Einstein 
did not at this point deny that various interpretations were possible; 
however, in order to avoid the introduction of action-at-a-distance, he 
expressly advocated (as did de Broglie by 1927) a view in which particle 
trajectories were always well-defined, and it is doubtful that Dirac would 
have entirely accepted this (see the discussion in Jammer [1974], pp. 
115—117). 

I now want to turn to several of the major themes in the book. The first 
concerns the above-mentioned role of the debate surrounding the unified 
field theory, which ‘does not . . . feature in the existing accounts of the 
development of quantum mechanics’ (p. 7). Hendry reminds us that besides 
Pauli, a number of theoreticians contributing to the development of 
quantum mechanics had been concerned with unified field theory: 
Sommerfeld, Born, Schrödinger, Dirac and Hilbert (p. 8). But the book 
really only deals with the case of Pauli in any systematic fashion. In fact, 
were it not for Pauli, the hitherto neglected connection between the two 
domains would look very slim indeed. Now I think that Hendry has made a
useful historical point in situating the early articulation of Pauli’s funda- 
mental and enduring views in the philosophy of physics within his critique 
of Weyl’s unified field theory. I do however have some strong misgivings 
about the manner in which he exploits this point in the book. 

In Chapter 2, Hendry highlights, if I read him correctly, four features 
within Pauli’s critique that were later to have important repercussions in his 
contributions to quantum theory. They are (i) the explicit adoption of an 
operationalist approach to the meaning of physical quantities, (ii) the belief 
in the irreducibility of corpuscular concepts to a pure field (continuum) 
formalism, (iii) the belief that not only classical laws, but also the 
fundamental classical concepts would have to undergo revision, and (iv) the 
pragmatic acceptance of a phenomenological description of microphe- 
nomena in the absence of anything better. 

Taking (iii) and (iv) first, the arguments cited by the author to substantiate
these features of Pauli’s thinking seem to have as much to do with the state of 
the old quantum theory as with unified theory per se. In particular, (iii), 
which was to be so prominent in the later Bohr—Pauli dialogue, is seen to 
arise out of Pauli’s analysis of the wave-particle ‘contradiction’ for light 
(p. 20). But this is a question that arises squarely in the quantum ‘prob- 
lem complex’, and quite independently of the current issue of unified 
field theory, where waves were waves and particles (reducible or otherwise) 
were particles. Feature (ii) is, as we saw earlier, largely a consequence of (i), 
and Hendry stresses the importance of the notion of discreteness in Pauli’s 
later critique of the 1925—6 quantum formalisms. But is the operationalism 
in (i), which provided ‘the strongest connection between the debate sur- 
rounding Weyl’s theory and the genesis of quantum mechanics’ (p. 14), 
strictly an outcome of Pauli’s studies of the post-1915 work of Einstein, 
Weyl and others?

Significantly, in this section of the book Hendry also cites Pauli’s 
criticism, written in a letter to Eddington, of the view that Maxwell’s 
equations retain their validity in the vacuum (i.e. for the free field), and that 
it is only in the matter-radiation interaction that the treatment need be non- 
classical (p. 21). Again, the criticism is based on the operationalist analysis of 
the field concept, but the addressed problem is one that arises in the old 
quantum theory. Why is Pauli’s operationalism, when directed to Weyl’s 
unified field theory, any more significant for the future development of 
quantum mechanics than when it is directed to fundamental questions in the 
‘old’ theory of quantum radiation? Admittedly, Pauli’s letter to Eddington 
was written several years after his early operationalist critique of Weyl’s 
theory. But the question remains: what is the source of Pauli’s 
operationalism? 

Hendry does not really fully address this question, although he does quote 
Pauli on p. 23 as saying, again in the letter to Eddington, that the greatest 
achievement of relativity was ‘to have brought the measurement of clocks 
and measuring rods, the orbits of freely falling mass points, and those of 
light rays into a firm and profound union.’ Now one might construe this to 
mean that operationalism is a lesson to be learnt from relativity theory. But 
here one would want to say that relativity theory is not unified field theory, 
and moreover that the lesson was surely learnt first, if relativity is relevant at 
all, from the special theory of 1905. Indeed, many would argue that the 
general theory of 1915 marked a significant departure from the operationa- 
lism in the special theory (to the extent that it was ever systematically 
present there in the first place). It would appear that Pauli did not take this 
line, but this does not affect my point. All this ties in with the well-known
fact, evidenced later in the book, that when the founders of quantum 
mechanics, and particularly Heisenberg, cited relativity theory as a source of 
methodological inspiration, or at least precedence, it was the special and not 
the general theory that was being referred to. 

Turning now to the question of Pauli’s influence in the birth of quantum
mechanics, as mentioned earlier Hendry sees Heisenberg’s debt to Pauli as 
enormous, with Heisenberg having adopted both Pauli’s phenomenological 
and his operationalist approach in reaching the new kinematics in 1925. But 
it is worthwhile glancing at the author’s detailed account of this episode in 
Chapters 4 and 5. Pauli’s role here was essentially twofold: he criticised 
Heisenberg’s early work on the core model of atomic structure as being 
wholly inadequate from the point of view of providing a coherent conceptual 
foundation for the physics of the atomic structure, and he set out the 
heuristic guidelines—largely based on the four features of his thinking listed 
above—for future work. Hendry states on page 50 that ‘Pauli’s views on the
inadmissibility of the concept of electron orbits within the atom . . . may be 
traced back to his criticisms of Weyl’s attempt at a unified field theory’. But 
on the same page he mentions Heisenberg’s recollection that rejection of the 
notion of electron orbits was shared by both Pauli and Heisenberg during 
their student days together in Munich. (Which of the two students 
influenced the other in this respect is not a question the author chooses to 
raise.) Moreover, it seems that Heisenberg’s rejection of the atomic core 
model was a consequence not of the fact that it was, as Pauli had argued, 
philosophically unsound (Heisenberg himself stressed that it only had a 
‘symbolic sense’, p. 50), but that it simply didn’t work (p. 65). Heisenberg 
now moved towards an ‘empirically-oriented’, i.e. phenomenological, 
approach similar to that advocated by Pauli. Interestingly, Hendry’s 
account of this transition seems at one point to suggest that Pauli’s influence 
was only of secondary importance: ‘This new and specific expression of
Pauli’s [methodological] ideas [in late 1924, after Pauli had briefly ab- 
andoned work in quantum atomic theory] was well timed, for it coincided 
with Heisenberg’s move in the very same direction’ (p. 65). However, the 
final ingredient in Heisenberg’s thinking leading to his 1925 breakthrough 
was the adoption of a fully fledged operationalist stance, and this Hendry 
attributes unequivocally to Pauli’s influence (pp. 66, 69, 118, 122, 126, 130). 

Taken as a whole, Hendry’s own account suggests, I think, that 
Heisenberg’s route to the new kinematics was a complicated one, relying in 
part on Pauli’s strictures when convenient, but to a large extent guided by 
his own instincts, and by the lessons he had learnt working on, inter alia, the 
ill-fated core model. (Given Heisenberg’s independent spirit, it would be 
surprising were it otherwise.) This reading, together with the doubts 
expressed above concerning the origins of Pauli’s operationalism, to my
mind justifies a certain degree of scepticism regarding Hendry’s rather
startling claim in the Introduction that ‘Heisenberg’s work is also rooted ... 
in the search for a unified general relativistic field theory’ (p. 7). 

I now move on to the end of the story and Hendry’s treatment in Chapter 
9 of the appearance of the uncertainty relations and Bohr’s principle of 
complementarity. In his account of the tortuous development of 
Heisenberg’s thinking between late 1926 and early 1927, which led to the
enunciation of the uncertainty principle and its physical significance
(pp. 111-118), Hendry leaves no doubt as to Heisenberg’s reliance on 
Pauli’s advice and criticism. But to my mind the treatment lacks the 
clarity and incisiveness that is a feature of the commentary in preceding 
chapters; at times there is a hint of uncertainty as to precisely what 
Heisenberg thought he was doing. It is, however, Hendry’s description 
of the emergence of Bohr’s principle of complementarity in his 1927 
Como lecture, and its reception by Pauli (pp. 118-126), that I wish to 
examine in detail. 

After reviewing various arguments presented in the Como lecture (to 
which I will return shortly), Hendry provides the following account of the 
import of Bohr’s thinking: 


Heisenberg had adopted Pauli’s operationalist creed that observability and define- 
ability should be equated in a consistent theory. But unable to create such a theory he 
carried his creed into his analysis of a theory that was still, in terms of its conceptual 
foundations, inconsistent. Bohr’s analysis showed that this inconsistency lay in the 
very fact that an operational definition of the kinematic concepts needed was 
impossible. The ideals of observation and definition, both necessary to any physical 
theory, were in fact incompatible. Bohr defined this combination of joint necessity 
and mutual incompatibility through the notion of ‘complementarity’, and from the 
complementarity of observation and definition he derived that of space-time
description and causality, and that of the wave and particle pictures. (p. 126) 


Hendry’s view is then that the notion of complementarity grew out of the 
observation-definition conflict; the (more familiar) complementarity of 
space-time description and causality, and that of wave and particle pictures 
are consequences of, and thus less fundamental than, the complementarity 
of observation and definition. (Later, on p. 131, he repeats this: ‘Bohr did 
present his principle of complementarity in terms of the relationship 
between observation and definition, and not in terms of what was in fact the 
deduced relationship between wave and particle pictures.’) 

Now although this interpretation of the complementarity principle is 
somewhat abstract, and I think not entirely transparent, it is nonetheless 
crystal clear that it is incompatible with operationalism as it is normally 
understood. Definition and observation just cannot be ‘mutually incom- 
patible’ if operationalism is to make any sense. If Hendry is right about this 
notion of complementarity being what was central in Bohr’s thinking, and if 
we grant, as Hendry implicitly assumes, that Pauli (the champion of 
operationalism) read Bohr correctly on this crucial point, then one might 
find the fact that Pauli was quick in his support of Bohr’s new doctrine as 
prima facie puzzling. Not so Hendry. ‘Pauli, as may be expected, accepted 
Bohr’s new ideas enthusiastically ...’ (p. 126, my italics). But how can this 
be? Bohr’s reasoning, as Hendry construes it, represents a rejection of what 
Hendry stresses to be the central component of Pauli’s thinking ever since 
his youthful critique of unified field theory, and which once adopted by 
Heisenberg led to the discovery of the physics that Bohr was now trying to
interpret. Hendry in fact goes on to concede this point. ‘The main force of
the complementarity principle lay in its demonstration that for quantum 
phenomena any operationally defined system of concepts was impossible, 
the processes of operation and definition being themselves incompatible, 
and in this sense the principle represented a victory for Bohr over Pauli.’ 
(p. 126). But, he continues, ‘. . . Bohr had also now admitted that the 
classical conceptions in particular were incapable of consistent application, 
and that this limitation arose, as Pauli had always argued, from their being 
operationally ill-defined’. (p. 126, my italics). Yet if operationalism is ‘im- 
possible’ for quantum phenomena, then Pauli’s original argument in 
favour of the invalidity of classical concepts was, in so far as it is based on 
operationalism, simply wrong. 

It is hard not to conclude that given Hendry’s construal of the 
complementarity principle, Pauli’s ‘enthusiastic’ support of Bohr’s new 
philosophy remains somewhat mysterious. Moreover it is hard to avoid the 
suspicion that Hendry wants to have it both ways: Pauli was wrong about 
operationalism, and yet he was right. This ambivalent, if not incoherent,
state of affairs is reinforced in the following passage found in Chapter 10
(Concluding Remarks). 


The long sought for new system of operationally defined concepts upon which 
quantum mechanics was to have been built upwards had remained elusive, and in 
this sense Bohr’s views had finally prevailed. But this inadequacy of existing 
concepts, and in particular of visualisable models, had been established and given a 
foundation. As Pauli had always maintained this foundation lay in their operational 
inadequacy, expressed in terms of a complementarity between definition and 
observation which effectively prohibited any operationally based definition (p. 129, my 
italics). 

Let us return now to Bohr’s doctrine of complementarity. Far be it from me
to say that the logic of this doctrine, especially in its first tentative 
articulation in the 1927 Como lecture, is easy to grasp in its entirety, and that 
it is not open to a range of interpretations; yet I do not find Hendry’s 
interpretation of it convincing. The alleged primacy of the complementarity 
between observation (or ‘operation’) and definition follows from Hendry’s 
reading of the following passage from the Como lecture (pp. 125, 126). 


... the quantum postulate implies that any observation of atomic phenomena will
involve an interaction with the agency of observation not to be neglected. On the one 
hand, the definition of a state of a physical system, as ordinarily understood, claims 
the elimination of all external disturbances. But in that case ... any observation will
be impossible, and, above all, the concepts of space and time lose their immediate 
sense. On the other hand, if in order to make observation possible we posit certain 
interactions with suitable agencies of measurement, not belonging to the system, an 
unambiguous definition of the state of the system is no longer possible, and there can 
be no question of causality in the ordinary sense of the word. 


It is perhaps curious that Hendry does not cite the next sentence in Bohr’s 
lecture, which seems to strengthen his case: ‘The very nature of the quantum
theory thus forces us to regard the space-time coordination and the claims of
causality . . . as complementary but exclusive features of the description,
symbolising the idealisation of observation and definition respectively’ (my 
italics). 

Yet is there anything in the cited passages that unambiguously implies an 
outright rejection of operationalism? I doubt it, but that such a view is 
possible is, I grant, an outcome of Bohr’s peculiar mode of expression. 
However, it is plausible that Bohr’s wording in 1927 did not really do justice 
to his fundamental insights. It is interesting to compare Bohr’s reasoning 
above with a later and more elaborate version of essentially the same 
argument. 


Any phenomenon in which we are concerned with tracing a displacement of some 
atomic object in space and time necessitates the establishment of several coinci- 
dences between the object and rigidly connected bodies and movable devices which, 
in serving as scales and clocks respectively, define the space-time frame of reference 
to which the phenomenon in question is referred. Just this situation implies, 
however, a renunciation of any sharp control of the amount of momentum or energy 
exchanged during each coincidence between the object and the separate bodies 
entering into the experimental arrangements. Inversely, every phenomenon in 
which we are essentially concerned with momentum and energy exchanges—and 
which therefore necessitates an experimental arrangement allowing at least two 
successive determinations of momentum and energy quantities—will, in principle, 
imply a renunciation of the control of any precise space-time coordination of the 
objects in the time intervals between these measurements (Bohr [1939]). 


Here, Bohr is not contrasting the unobserved system with that under 
observation; he is comparing two (exclusive) measurement arrangements, 
one to determine the space-time trajectory, the other to test the momentum- 
energy conservation laws. (These laws take the place of the ‘claims of 
causality’ in the 1927 paper, a terminology unique to that paper. See Scheibe 
[1939], p. 30.) The complementarity between ‘space-time coordination’ and
the dynamical conservation laws cannot be read here as the consequence of a 
putative complementarity between observation and definition. Moreover, in 
so far as it is a crucial feature in Bohr’s philosophy to restrict the definition, 
or applicability, of space and time coordinates for the system to the first 
‘phenomenon’, and that of momentum and energy to the second, it might be 
said that, indeed, there is a clear leaning towards operationalism in Bohr’s 
thinking, rather than away from it. (This point is widely recognized; see, for 
example, Hooker [1972], pp. 167-171.) Now could it not be that Pauli
perceived this from the outset? If so, this may be considered one of the
reasons why Pauli could give his support to the doctrine of complementarity 
in 1927, but clearly such an account is incompatible with Hendry’s 
reconstruction of the episode. 

The book deserves, and I would like, that this review end on a more
positive note. Of the features that I found admirable in the book, there are 
two I would single out for mention. 

First, there is Hendry’s treatment of the thorny issue regarding the
repercussions for the development of quantum mechanics of the intellectual
and cultural milieu in Europe, and particularly Weimar Germany, with its 
prevailing hostility to causality, determinism, and to some extent, realism. 
In the book, Hendry convincingly demonstrates the limitations within 
historical accounts which correlate major theoretical trends within the 
growth period of quantum physics with specific tastes and tendencies in the 
surrounding cultural environment. While careful not to denigrate the 
seminal work of Forman and others in the field, he shows that it was not 
causality per se, but the internal status of the physics (and particularly the 
momentum-energy conservation laws) that, during the period in question, 
mattered most. 

Secondly, the constant emphasis on foundational issues in the book gives 
an illuminating insight into the degree of factiousness and dispute among 
many of the leading contributors to the development of the new theory. The 
detailed treatment of the 1924 Bohr-Kramers-Slater and the 1925
Born-Heisenberg-Jordan ‘collaborations’, and of the early reaction to
Born’s 1926 probabilistic interpretation shows how little unanimity in 
foundational questions there was, at least prior to 1927, amongst those 
whose names are associated today with the Copenhagen interpretation. Of 
course this, in itself, is no revelation, but I am not sure that the evidence has 
been as well marshalled elsewhere. 

I found very few printing errors in the book, the worst (but still very 
obvious) one being on page 34, where ‘energy conservation’ should read 
‘energy non-conservation’ in line 18. 


HARVEY R. BROWN 
University of Oxford


KYBURG Jr, H. E. [1984]: Theory and Measurement. Cambridge University
Press. 


This monograph in the Cambridge Studies in Philosophy series deals with 
topics in the foundations of measurement and the relation between 
measurement and theory from the point of view of the philosophy of science.
The treatment is partly formal, some of the informal passages are not easy to 
read, and, as far as statistical issues are involved, prerequisite knowledge is 
assumed. The book is of main interest for philosophers of science; scientists 
interested in the topic of measurement will perhaps find it difficult to read. 


Contents Leaving aside the usual items as well as chapter 2, introducing 
some basic concepts, the book has nine chapters. In chapter 3 a picture of the 
succession of qualitative theories (Kyburg uses the term ‘language’ where 
most other people would speak of a ‘theory’) about a given range of 
phenomena is drawn by considering the example of a succession of theories 
about length-comparisons. In each step new axioms about the well known 
observable relations ‘<’ (‘is shorter than’) and ‘O’ (‘concatenation’) are 
introduced, and the effects of the addition of the new axioms on the number 
of erroneous atomic judgements about these relations as well as on the 
number of observable predictions is studied. Kyburg proposes two prin- 
ciples that should be obeyed in eliminating contradictions which arise 
inevitably if many observations (including erroneous ones) are made. The 
first principle roughly says that only a minimal number of observation 
statements should be rejected in order to achieve consistency. The second 
states that (provided that the first one is satisfied) the rejections should be
(approximately) equally distributed among the different kinds of observ- 
ation statements where ‘kinds’ are given by the different primitives and 
negation. In chapter 4 the example of chapter 3 is continued. In order to 
account for strong and idealising axioms in the light of possibly erroneous 
observation statements, Kyburg in addition to the ‘observable’ relation < 
introduces a ‘theoretical’ predicate <* of the same type for which 
appropriate axioms can be stated without getting immediate contradictions. 
By means of <*, an equivalence relation ~* (‘indistinguishability’) is 
defined. On this basis the ‘true’ length function is defined to assign to each 
object x its equivalence class [x] under ~*. Next, a measurement relation is 
introduced expressing, roughly, that measurement of x yields the result 
‘s units’. This relation is explicitly defined, the definition stating essentially 
that x cannot be observationally distinguished in length from a concate- 
nation of s copies of the unit. Third, the error made by measuring x to be s 
units long is defined very roughly as the difference between s and a number r 
such that r units are theoretically indistinguishable (i.e. by ~*) from x. The
existence of r is stated in T6, page 68 (where ‘+’ is misprinted for ‘=’). A 
formal treatment of the results of chapter 4 is added as an appendix. We were 
not able to follow the proof of T6 there, the theorem again being misprinted 
(replace ‘LE f(x) =r'b} by ‘LF(x) =r-[b,]’). The rest of the chapter is 
devoted to an informal discussion of how to infer the distributions of errors 
which occur when many measurements are made. It is argued that the two 
principles mentioned in fact provide for such inference. 

In chapters 5, 6 and 7 Kyburg distinguishes ‘direct’, ‘indirect’ and 
‘systematic’ measurement. The distinction, it is admitted, is not very sharp.
Chapter 5 contains various brief descriptions of examples of ‘direct’
measurement, i.e., measurement which does not refer to measurement of
another quantity. Similarly, chapter 6 contains examples of indirect 
measurement which involve measurement of other quantities. Chapter 7 
deals with systematic measurements involving ‘three or more quantities that 
are systematically related’. Examples are taken from Euclidean geometry, 
thermodynamics, and the theory of electricity. 

In chapter 8 the question of comparing and ‘reducing’ dimensions is 
addressed. Chapter 9 elaborates in general on the idea of obtaining 
distributions of error from sets of measured values along the lines indicated 
in the example of chapter 4. The procedure is explained in detail for an 
example of the finite scale (pp. 193-5). The cases of indirect and systematic
measurement are discussed informally. In chapter 10, among other things, it is
shown how measurements as performed in a community may yield 
considerable improvements of predictive content. Chapter 11, finally, 
compares Kyburg’s account rather briefly with others known from the 
literature. 


Achievements The most important idea contained in this work in our
opinion is the idea that measurement can play a role in the development of 
science (of theories) only if it goes together with estimations of error. 
Though the idea stated in this vague formulation is not new, Kyburg’s book 
is (as far as we know) the first attempt to work it out in some detail. In 
chapters 3 and 4 as well as in chapter 9 he shows how to deal with erroneous 
observation statements, how to infer distributions of error, and how the 
latter function in the assessment of a theory. The introduction of ‘true 
values’, ‘measured values’ and ‘error’ in chapter 4 is admirably simple and 
elegant. No further account of the subject could possibly bypass this 
treatment.

Second, the way he treats the distinction between direct, indirect and 
systematic measurement, namely in a rather pragmatic way, not looking for 
deep distinctions, deserves attention. We believe that it is only by avoiding 
basic distinctions like that between fundamental and derived measurement 
from the outset that the general features of measurement can be approximat- 
ively grasped. 

Further items to be mentioned are his general frame which distinguishes 
object- and meta-language without becoming clumsy; which distinguishes 
levels of certainty of sentences for an individual, and which incorporates 
original ideas about probability and sampling drawn from earlier work of the 
author. 


Criticism ‘Theory and Measurement’ is an ambitious title which easily
evokes criticism. We concentrate on a few points neglecting minor items.
First of all, the whole book has a strong flavour of ‘in principle’ as far as the
interplay of theory and measurement is concerned. We have no problems in
construing a result of some measurement as an atomic sentence but we find it
difficult to suppose that, in general, theories are formulated in just the 
vocabulary which is given by those atomic sentences. Of course, Kyburg 
does not require that all terms of a theory be measurable. But what if no term 
is measurable? The artificial duplication of terms (< and <*) in his 
treatment of length certainly does not reflect science as it is. And what
should we make of the statement that ‘the most sophisticated theoretical 
structures can be regarded as frameworks for systematic measurement’ 
(p. 258) considering, say, quantum mechanics or general relativity? It seems 
to us that in a certain respect Kyburg’s notion of a scientific theory 
(‘language’ as he says), which, in the tradition of logical empiricism, he takes
to be just a class of (formal?) sentences, is too narrow to represent
what really goes on in scientific developments. Typically, his ‘theories’ used 
by way of examples are those of extensive measurement, Euclidean 
geometry, the law of thermal expansion and Ohm’s law; certainly not the 
most interesting representatives of the species. Of course, we do not want to 
require that all kinds of ‘Kuhnian’ aspects are taken into account, but a brief 
look at ‘real life’ theories reveals a wealth of structure and distinctions, the 
treatment of which by just a class of sentences can only be called an 
impoverishment. It is his narrow conception of theories that causes the 
above-mentioned flavour. The immediate question one wants to ask is: ‘Yes,
but does all this apply to present day theories, too?’, and we have some 
reservations about the answer. 

Second, there is some disequilibrium between ‘theory’ and ‘measure- 
ment’. Much is said about measurement but little about theory. The only— 
admittedly important—point where theory comes in is in transitions from 
one theory to another, where considerations of numbers of erroneous 
observation statements yield a notion of predictive observational content 
and thereby a notion of which theory is to be preferred. In the light of the 
first point of criticism, the latter notion seems to be of a rather limited scope. 
The standard problems one would expect to be treated under the heading of 
theory and measurement, ‘confirmation’, ‘test’, ‘refutation’ are mentioned 
only very superficially in chapter 11. On page 240, in discussing an 
alternative approach, Kyburg goes so far as to say ‘I am regarding scientific
laws and hypotheses as . . . and therefore untestable and irrefutable’, but two 
pages later he uses the phrase ‘tested and refuted’ himself. 

Third, we think that his concept of the ‘true value’ is too limited. It will 
work in examples similar to those treated in the book, but not in general. 
The true value, as defined in chapter 4 for length of x is [x], the equivalence 
class of x under ‘theoretical indistinguishability’ ~*. By analysis of the 
definitions involved this roughly amounts to the following: The true value of 
x is the one given by a theoretical ordering <* which is ‘compatible’ with the 
observational ordering <. Without further explication of ‘compatible’ it can 
be seen that the ‘true’ value is determined by an observational relation plus 
its theoretical counterpart. We do not see how this definition could be
transferred to cases of comprehensive theories without considerable change
in the meaning of ‘true value’ as it is actually used. 

These criticisms are perhaps too demanding, and not all could be met by 
any other current approach. They indicate, however, the direction in which 
Kyburg’s work might be continued. As one step in a progressive series of
accounts, Theory and Measurement seems to achieve as much as could be 
achieved by such a book at the present time. 


W. BALZER
Seminar fuer Philosophie,
Logik und Wissenschaftstheorie,
Universitaet Muenchen
C. M. DAWE
Department of Statistics,
Birkbeck College,
University of London


GINGERICH, OWEN (Ed.) [1984]: Astrophysics and twentieth-century 
astronomy to 1950, The General History of Astronomy, Vol. 4A. 
Cambridge University Press. Pp. x+ 198 (ISBN 0-521-24256-8) 


The General History of Astronomy series, of which this part is the first to be 
published, aims to present to the non-specialist reader an account of the 
history of astronomy from the earliest times. The period covered here 
extends from around 1850, when the first progress in understanding the 
nature of the Sun heralded the beginning of the subject of astrophysics, to 
about 1950, when the advent of the first electronic computers was beginning 
to permit the first detailed accounts of the evolution of stars. In this, the first 
part of volume 4, the emphasis is on early astrophysics, instrumentation and 
institutions; it excludes the solar system, the theory of stellar evolution and 
cosmology which are promised in part B. For this period in particular the 
series fills a notable gap: many works on ancient astronomy, although not 
necessarily ‘popular histories’, are nevertheless readily accessible to non-
specialists, and contemporary astronomy has no shortage of popular 
expositions, but this is the first comprehensive general account of the 
‘modern’ period. 

It is a good time now to review the history of astronomy because another 
period of rapid innovation is taking shape. We can see this clearly if we
compare the early work on the physics of spectral lines initiated by 
Kirchhoff at the beginning of the period covered here with Kepler’s 
planetary laws and with contemporary particle physics. All three cases 
represent a fruitful synthesis of physics and astronomy. Kepler’s Laws were
important not just for the motion of the planets but for the eventual
revelation of general laws of motion. Kirchhoff’s understanding of the 
nature of emission and absorption lines led not just to the determination of 
the nature of the stars, but also to the physics of atoms. The nature and 
structure of the early universe (around 10⁻³³ sec ‘after’ the big-bang) will
not only explain the formation of matter and radiation, but will act as a probe 
of the fundamental structure of sub-atomic matter. Astronomical con- 
ditions often cannot be reproduced in a laboratory, so the only avenue by 
which possible solutions to astronomical problems become available is 
through theoretical physics. So new areas of progress in astronomy become 
areas of importance in physics and vice versa. This book charts the rise of 
what has come to be known explicitly as astrophysics. 

In his preface the editor of this part, Professor Gingerich, sets high goals 
for the general historian: he must seek to transport his reader to another age, 
helping him see the problems as they appeared, the context of theories, the 
available apparatus, and the network of communication within and outside 
astronomy. This is no mean task, especially in a sectionalised treatment 
having different authors. For example, the period covers not only the 
emergence of astrophysics, but also the decline of celestial mechanics. New 
exciting things tend to emerge in identifiable places or at identifiable times 
or be associated with particular people. Declines tend to be a bit more 
difficult to pin-point, and to be less of the responsibility of a particular 
author. So we are told declines occur, but we rarely see them. Yet they are
not without interest, and have a historical relevance from which we can learn.
Take, for example, the Mt Wilson observatory where in the 1920s astronomers
used the 100-inch telescope to discover the fundamental expansion of the 
Universe. Today it is a historic relic with none of the computerised control 
of a serious instrument. Yet its staff assemble in the same hierarchy to dine 
in the same 1920s style, forming a closed community amongst lockers still
labelled with the names of astronomers long since dead. One also looks in 
vain for information on the almost terminal decline of British observational 
astronomy between the Wars, although the lesser importance of Europe 
generally is apparent. 

How much of this is of interest for philosophy? There is here an abundant 
source of interesting but false theories. For example, to explain an otherwise 
unidentified spectral emission line in solar prominences, Lockyer postu- 
lated a new element, helium, and met with strong opposition until helium 
was finally detected in the laboratory nearly thirty years later. This 
facilitated the acceptance of the new elements coronium and nebulium as 
explanations of other unexplained lines in the solar corona and in gaseous 
nebulae. But eventually these were understood on the basis of the atomic 
theory as lines formed under high temperatures or low densities unat- 
tainable (even now in some cases) in the laboratory. Equally illuminating is 
the problem of stellar structure. First of all enough laboratory physics was 
required to enable the gaseous nature of the solar interior to be deduced
under conditions of high pressure. Then the problem was to explain the
different colours and luminosities of the stars. This began with classifi- 
cation, the scheme of Hertzsprung and Russell eventually emerging as the 
best. The interpretation was hampered by a lack of stellar radii and it was 
doubted that conditions in the Sun could be extrapolated. Eventually 
Eddington, aware that the Michelson interferometer under construction at 
the 100-inch telescope on Mt Wilson was capable of measuring large stellar 
diameters, published his prediction of diameters for giant stars. The Mt 
Wilson staff rushed to complete their interferometer, confirming 
Eddington’s prediction: the diameter of Betelgeuse was indeed larger than 
the Earth’s orbit. 

Much of interest emerges from the book on how astronomical research 
was carried out. There is, for example, the inevitable clinging to old ideas.
The introduction of photographic techniques extended over fifty years, 
pioneered largely by amateurs. Early attempts were indeed unsuccessful, so 
much so that the 1881 Congress banned its use for the 1882 observations of 
the transit of Venus. But one cannot help feeling that some of this failure was 
welcomed as an opportunity to avoid change, rather than to put in the effort 
to make photography work. Its eventual introduction had, of course, 
unimaginable scientific consequences, but also an interesting sociological 
result. Photography led to the accumulation of large amounts of material 
requiring routine analysis which was considered suitable for women and led 
to their entrance into astronomical research. It also meant that data could be 
gathered in remoter parts of the world and sent back to the parent institution 
for analysis, a state of affairs which not surprisingly led to friction.

A similar situation prevailed with respect to the introduction of reflecting 
telescopes. These did not attain complete dominance until the early years of 
this century. After that, advances in instrumentation accelerated enorm- 
ously. Since there is no point in making serious proposals to do the 
impossible, new instrumentation has always depended on what could be 
engineered at the time. It would have been interesting to learn more of the 
importance of scientific aspects in determining the pace of instrumental 
advance. Much of the development depended on private donations. Had all 
benefactors, like Mr Charles T. Yerkes, simply ‘dreamed since boyhood of 
the possibility of surpassing all existing telescopes’, and were, indeed, the 
great scientist/builders attracted simply by the machinery? Were they aware 
that a bigger telescope might simply see more of the same? For example, the 
200-inch aperture of the Palomar telescope seems to have no more 
significance than being half way between the 100-inch Mt Wilson mirror 
and the best guess for the then largest possible mounting of around 300 
inches. A related point, hinted at in the text, is the dependence in general of 
pure scientific instrumentation on advances in other areas. Engineering 
lessons from the Golden Gate Bridge made possible the Palomar 200-inch 
telescope, not the other way round. The one notable exception was, naturally 
enough, the large telescope mirrors themselves. Indeed, Hale contracted for 
the building of the dome and support of the 100-inch telescope before a 
satisfactory mirror had been shown to be possible. 

This remark also serves to highlight the importance of individuals even in 
‘big-science’. Hale was undoubtedly a genius at extracting money for 
astronomy whether from government agencies, charitable institutions or 
private individuals, although—and the book does not make this point—the 
amounts given to astronomy were not large in comparison with donations to 
other causes. Also important was the increasing need to organise on a larger 
scale. Again it would be interesting to compare other areas. What is 
important is not just the size of a project, but its degree of integration: a lot of 
nuts and bolts may cost a lot of money but present no great difficulty. In this 
sense astronomical projects are generally not large, except when compared 
with most other areas of pure science. Even the organisation of the scientific 
part of the space-programme derives from the much greater military 
capability. Indeed, one of the major problems of contemporary astronomy is 
the apparently insuperable difficulty of coordinating observations of the 
same object using different instruments. It would be interesting to see these 
points emerge. 

One attempt that was made at a grand coordination was a failure. This was 
the Carte du Ciel, an attempt to map the whole sky photographically, with 
various exposure times, involving the cooperation of some eighteen 
observatories. The project started in 1887, was finally ‘completed’ in 1964, 
producing a catalogue of no discernible value, having consumed vast 
amounts of observing time in the pure pursuit of the possible. The 
Americans declined to participate in the project, and this certainly 
contributed to the more rapid advance and eventual leadership of the US in 
the field of astrophysics. 

The study of the history of astronomy is not highly regarded by many 
practitioners of astronomy. It looks too much like an academic game played 
between those not willing to do astronomy proper, on a par, say, with writing 
a history of history books. In general, a scientific discussion can only become 
important if it ceases to be restricted to its immediate practitioners. I believe 
that scientists should be aware of the historical development of astronomy, 
especially at a time when a combination of grand projects and economic 
difficulties has all but bankrupted British astronomy. It seems to me that 
this series will do much to establish the wider importance of the subject. 


DEREK J. RAINE 
Department of Astronomy, University of Leicester 


SKYRMS, BRIAN [1984]: Pragmatics and Empiricism. Yale University 
Press. Pp. xi+143. £16.95. (ISBN 0-300-03174-2.) 


This short book consists of six chapters. The middle four chapters (which 
constitute three-quarters of the text) develop a Subjective Bayesian 
approach to several issues in the philosophy of science. The two chapters 
flanking the heart of the book attempt to place Skyrms’ Bayesian approach in 
a tradition of work rooted in logical positivism. Readers who are already 
reasonably comfortable with a Bayesian approach and who are reasonably 
familiar with standard Bayesian methods and results will welcome Skyrms’ 
book into the growing Bayesian empire. Others will not likely be persuaded 
to adopt a Bayesian approach and may well be frustrated by Skyrms’ dense 
and demanding style. 

In the first chapter of the heart of the book (Chapter 2) Skyrms introduces 
the central foundational arguments which link probability to a person’s 
degrees of belief. He presents a version of the Dutch book argument. He 
discusses standard probabilistic representation theorems for decision 
making in the face of uncertainty. Finally he works at dispelling worries 
about higher order degrees of belief about degrees of belief. Overall his aim 
is to justify a full-blooded use of Subjective Bayesian probabilities. While he 
acknowledges that the Subjective Bayesian framework is not above criti- 
cism, ‘its foundations compare favorably with most of what passes as 
epistemology’ (p. 36). 

Comfortable with such foundations, Skyrms then employs the Bayesian 
framework to help solve three problems much in discussion in recent 
philosophy of science. He includes a chapter apiece presenting and 
defending a Bayesian theory of induction (learning from experience), of 
rational (causal) decision making, and of subjunctive conditionals. In each 
case Skyrms shows how a close analysis of a rational person’s degrees of 
belief will reveal fully adequate solutions to standard problems which 
confound other treatments. 

Each of these three chapters offers a genuine contribution to ongoing 
work in the three respective areas of inquiry. Skyrms develops a general 
theory describing how a Bayesian can learn from experience about chances, 
together with a de Finetti-style reduction of the concept of chance to 
Subjective degrees of belief. His theory is structurally identical to classical 
ergodic theory. Skyrms’ theory of rational decision making is sensitive to 
causal considerations while, at the same time, allowing for a similar 
reduction of the concept of cause to Subjective degrees of belief. Thus his 
theory can cope with Newcomb-style problems, because it is sensitive to 
causal considerations, without requiring the epistemologically suspicious 
concepts that evidential decision theorists object to. Finally, Skyrms presents a 
general theory of subjunctive conditionals that has the conflicting theories of 
Adams and of Stalnaker and Lewis as special cases. Here again the only 
conceptual apparatus needed is Subjective degrees of belief. 

In the first chapter of the book Skyrms argues that the machinery 
provided by Subjective probability is just what has been needed to fulfil the 
logical positivist programme in philosophy. The logical positivists could not 
find a satisfactory means for distinguishing metaphysical (rubbish) from 
genuine empirically based claims to knowledge. Skyrms briefly recounts 
their attempts to provide a criterion of empirical meaningfulness first in 
purely syntactic terms and subsequently in semantic terms. Skyrms 
provides such an account in the pragmatic terms of Subjective probability. 
While pragmatics—‘the study of the relations of signs and their signifi- 
cations to sign users’ (p. 10)—has commonly been seen as a part of 
psychology or sociology, Skyrms rightly points out that Subjective prob- 
ability, being based in a normative model of the rational person, provides an 
excellent example of philosophical pragmatics. Put briefly, Skyrms’ 
Bayesian criterion of empirical meaningfulness holds that a proposition, p, is 
empirically meaningful for a particular person at a particular time if and only 
if there is some empirical evidence which the person believes is possible 
(assigns non-zero probability to), the obtaining of which would alter that 
person’s degree of belief in p. Empirical meaningfulness is the flipside to 
unrepentant dogmatism. 
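
Stated symbolically, the criterion might be sketched as follows. The notation is an assumption of this sketch and is not taken from Skyrms: $P_t$ stands for the person's degrees of belief at time $t$, $e$ ranges over evidence propositions, and belief change is assumed to go by conditionalisation.

```latex
p \text{ is empirically meaningful for the person at } t
\iff
\exists e \,\bigl[\, P_t(e) > 0 \ \text{ and } \ P_t(p \mid e) \neq P_t(p) \,\bigr]
```

On this reading the dogmatist about $p$ is the person for whom $P_t(p \mid e) = P_t(p)$ for every $e$ with $P_t(e) > 0$.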

Skyrms notes several implications of his criterion of empirical meaning- 
fulness beyond the fact that it relativises the notion of empirical meaning- 
fulness to individual (degree of) believers. ‘One man’s metaphysics may be 
another’s empirically meaningful proposition’ (p. 15). In particular meta- 
physical propositions may well have a truth value, and thus, be meaningful. 
Dogmatically believing the truth, however, is not enough for someone to 
rightly claim knowledge. Knowledge requires some kind of empirical 
justifiability. Thus, Skyrms thinks that the logical positivists were right to 
attempt to distinguish metaphysics from empirically meaningful pro- 
positions; metaphysics cannot be known. 

In the last chapter Skyrms pursues questions raised by the logical 
positivists’ desire to eliminate metaphysics. He distinguishes several 
situations which can give rise to Skyrms-Bayesian metaphysics. In all cases, 
however, Skyrms doubts whether it is necessary or desirable to eliminate 
such metaphysics. Ironically, the elimination of certain Skyrms-Bayesian 
empirically meaningful propositions makes more sense to Skyrms. His 
favourite example is de Finetti’s theorem which shows that under well 
specified conditions—generally referred to with the word exchangeability— 
the notion of objective chance can be seen as a mathematical artifact induced 
by degrees of belief about what will happen. A primary aim of Skyrms’ 
earlier chapters on induction, decision theory and subjunctives is to 
generalise de Finetti’s result and to develop similar results for causal and 
modal concepts. Propositions employing such concepts are Skyrms- 
Bayesian empirically meaningful, but happily they are superfluous. 
Curiously, ‘[p]roof of eliminability is . . . a reason not to eliminate. We 
should, I suppose eliminate gobbledegook from our language and pig- 
headedness from our way of thinking, but I can see no good reason for 
eliminating all eliminable propositions’ (pp. 118-19). 
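
For the simplest case, an infinite exchangeable sequence of 0/1 outcomes, the standard statement of de Finetti's representation theorem may help to show in what sense chance is a 'mathematical artifact'. The symbols $X_i$, $\theta$ and $\mu$ are conventional notation assumed for this sketch, not drawn from Skyrms' text.

```latex
P(X_1 = x_1, \dots, X_n = x_n)
  = \int_0^1 \theta^{k} (1-\theta)^{n-k} \, d\mu(\theta),
\qquad k = \sum_{i=1}^{n} x_i , \quad \text{for every } n .
```

The person's degrees of belief thus behave as if there were an unknown chance $\theta$ with prior distribution $\mu$, although only degrees of belief about the outcomes were assumed.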

Skyrms’ book contains important work. His chapters on induction, 
decision theory and subjunctives present valuable contributions to these 
fields. They read like compilations of separate articles, which, in fact, they 
are, but there is a unifying theme: that contemporary Subjective probability 
provides the needed pragmatic fulfilment to logical positivism’s aims. The 
results are interesting and insightful pragmatic reductions of a variety of 
concepts to Subjective probability. 

However, such a single-minded reliance on Subjective probability does 
not do full justice to the issues involved. The logical positivists sought to 
eliminate metaphysics, not just metaphysics-for-you. Similarly, meta- 
physicians aim to present universal doctrines, not just universal-for-me- 
doctrines. Skyrms’ proofs of eliminability are only reasons not to eliminate if 
the concepts so eliminated were indeed mysterious—in which case Skyrms’ 
proofs show they are not so mysterious. Yet Skyrms’ treatment of 
metaphysics removes the logical positivist’s explanation for such mystery 
and Skyrms provides no other explanation. Furthermore, Skyrms’ monism 
places an unbearable burden on the arguments establishing Subjective 
probabilities. Subjective probabilities are best viewed as idealisations which 
reveal features of rationality in a variety of areas including learning from 
experience, decision making and the use of subjunctive language. They do 
not reveal all there is to rationality, nor should rationality, and human belief 
more generally, be the only important focus for metaphysics and 
epistemology. 


DAVIS BAIRD 
University of South Carolina 


AL-DAFFA, ALI A. and STROYLS, JOHN J. [1984]: Studies in the Exact 
Sciences in Medieval Islam. University of Petroleum and Minerals 
(Dhahran, Saudi Arabia) and John Wiley and Sons. x +243 pp. (ISBN 
0-471-90320-5). 


This book is a collection of seven essays, which are based on papers 
presented by the authors at conferences. These essays are attempts by the 
authors to achieve some form of synthesis of the relevant published 
literature until ca. 1980. The title, the number of pages and the subject of 
each essay are as follows: 

1. ‘Transmission of science and technology between East and West 
during the period of the Crusades’, 18 pp. (A comparison between 
transmission from Greek into Arabic and from Arabic into Latin using the 
examples of Ishaq ibn Hunayn and Gerard of Cremona. The authors also 
argue that most of the transmission of technology from the Islamic world to 
the West took place via Spain.) 

2. ‘Pythagorean theorems and mumpsimus’, 7 pp. (On Thabit ibn 
Qurra’s generalisation of the Pythagorean theorem.) 

3. ‘Some myths about logarithms in Near Eastern mathematics’, 5 pp. 
(The authors explain that neither logarithms nor a prosthaphairesis formula 
was known in the Middle East.) 

4. ‘Nasir al-Din al-Tusi’s attempt to prove the parallel postulate of 
Euclid’, 29 pp. (Also deals with other Arabic ‘proofs’ of the parallel 
postulate.) 

5. ‘Ibn Sina as a mathematician’, 58 pp. (A summary of the mathematical 
parts of his extant works.) 

6. ‘Numerical analysis in the Middle East: ninth through fifteenth 
century’, 12 pp. (A brief survey of work on number systems and 
computation, numerical approximation of roots of equations, determination 
of areas and volumes of curvilinear figures.) 

7. ‘The geometric theory of equations in the Near East in the middle 
ages’, 88 pp. (An analysis of this essay will be given presently.) 

Each essay is followed by a large number of footnotes, and at the end of the 
book there are separate bibliographies for the individual essays. 

In a review of this scope it is obviously impossible to discuss all seven 
essays thoroughly. I will therefore briefly mention some general aspects of 
the book, and then analyse only the seventh essay in more detail. This choice 
is motivated by the consideration that the seventh essay is not only the 
longest, but perhaps also the most inaccessible part of the book. 

To begin with some positive aspects: The best essay is probably no. 5. Ibn 
Sina (Avicenna) was not primarily a mathematician, but some of his works 
contain a good deal of (rather dull) elementary mathematics, of which the 
authors present a useful summary. Essay no. 4 contains a concise mathema- 
tical comparison, from an axiomatic point of view, of several Arabic ‘proofs’ 
of the parallel postulate. Throughout the book, there are scattered remarks 
which betray a good mathematician, for example in the intelligent conjec- 
tures about the (still unpublished) Algebra of Sharaf al-Din al-Tūsī (twelfth 
century) at the end of the seventh essay. The large bibliographies are useful 
(but not well-organised). 

However, the book as a whole is disappointing. The essays are for the 
most part unoriginal, superficial and sometimes incoherent. The book 
contains hundreds of inaccuracies of various kinds (well over a thousand, if 
printer’s errors are included). The treatment of the sources and the 
mathematics is sometimes gravely defective, in spite of the intelligent 
mathematical remarks mentioned above. I cannot escape the impression 
that this curious contrast finds its origin somehow in the dual authorship of 
the book. 

My objections to the book will be illustrated by the following brief 
analysis of the seventh essay, on the geometrical theory of solutions of 
(mostly cubic) equations. The authors begin by mentioning the Babylonian 
background, the Arithmetica of Diophantus, Book II of the Elements and 
early Greek work on conic sections. Then they turn to proposition 4 of Book 
2 of Archimedes’ On the Sphere and Cylinder. There Archimedes reduces 
the problem (1) to cut a given sphere by a plane in a given ratio, to an 
auxiliary problem (2), which is stated in a special form (2a) necessary for the 
solution of (1), and in a much more general form (2b), equivalent to the cubic 
equation $x^2(a - x) = bc^2$ for arbitrary positive a, b, c. The authors do not 
seem to have sufficiently understood the difference between (2a) and (2b). 
They say on page 138 that (2b) is as follows in the translation by Bulmer- 
Thomas: 


it is required to cut the given straight line DZ at X so that XZ bears to the straight 
line ZT as the area $(BD)^2$ bears to the square on (DX). 


Here the authors actually misquote Bulmer-Thomas’s translation 
([1951], p. 133), which reads: 


it is required so to cut the given straight line DZ at X that XZ bears to a given straight 
line the same ratio as a given area bears to the square of DX. 


The authors then state that problem (2a) is always soluble ‘since 
(presumably) the problem arises in a “natural way”’ (p. 138, emphasis 
added). As a matter of fact, (2a) is soluble because it is a special case of (2b) 
such that the conditions for the solubility of (2b) are satisfied. I can only 
explain the error by assuming that the authors garbled the following passage 
in the translation of Bulmer-Thomas ([1951], p. 135; emphasis added): 


In the present case [i.e. 2a] the problem will be of this nature: Given two straight lines 
BD, BZ,... 
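
The solubility condition for (2b) can be made explicit in modern notation. What follows is a modern calculus sketch, not Archimedes' own argument nor that of the book under review; it assumes the form $x^2(a - x) = bc^2$ with $0 < x < a$ and positive $a$, $b$, $c$.

```latex
f(x) = x^2 (a - x), \qquad
f'(x) = 2ax - 3x^2 = 0 \;\Longrightarrow\; x = \tfrac{2a}{3}, \qquad
\max_{0 < x < a} f(x) = f\!\left(\tfrac{2a}{3}\right) = \frac{4a^3}{27}.

% Since f is continuous and tends to 0 at both ends of the interval,
% (2b) has a solution with 0 < x < a if and only if
b c^2 \le \frac{4a^3}{27}.
```

Problem (2a) is soluble because the particular values of $a$, $b$, $c$ that arise from the sphere problem satisfy this inequality.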


The next section contains a fairly accurate account of ‘Umar al- 
Khayyam’s geometrical solutions of all types of cubic equations. The 
authors misinterpreted part of Al-Khayyam’s (confused) solution of 
$x^3 + bx + c = ax^2$ (p. 155, Figure 13). The solution is correctly explained in 
Woepcke’s [1851] edition, pages 51-2, footnote. 

The authors then devote 50 pages to a discussion of the ‘problem of 
Alhazen’, named after the famous mathematician and physicist Al-Hasan 
ibn al-Haytham (965-1041), who solved this problem in Book V of his great 
Optics. Since there is as yet no published account of the entire solution in a 
modern Western language (Sabra [1982] is only concerned with the 
lemmas), I begin with a preliminary explanation. 

The ‘problem of Alhazen’ is as follows: given a circular mirror with centre 
C, and two points A and B. Required: all possible ‘point(s) of reflection’ R 
on the circumference of the circle such that the angles ARC and CRB are 
equal. We will only consider (with the authors) the case where A and B are 
inside the circle. Notations such as V:87 refer to proposition numbers in the 
[1572] Risner edition of the twelfth-century Latin translation (the Arabic 
text is as yet unpublished). 
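
A minimal numerical sketch of the problem as just stated may help the modern reader. It is not Ibn al-Haytham's construction, and everything named in it is an assumption of the sketch: the mirror is taken to be the unit circle with centre C at the origin, A and B lie strictly inside it, and the function names are hypothetical. The requirement that the angles ARC and CRB be equal is rewritten as a single residual, whose zeros are located by bisection on a grid.

```python
import math


def reflection_condition(t, A, B):
    """Residual that vanishes when R = (cos t, sin t) is a point of reflection.

    Assumes the unit circle with centre C at the origin and A, B strictly inside.
    """
    rx, ry = math.cos(t), math.sin(t)
    nx, ny = -rx, -ry                # vector from R to the centre C
    ax, ay = A[0] - rx, A[1] - ry    # vector from R to A
    bx, by = B[0] - rx, B[1] - ry    # vector from R to B
    cross_a = nx * ay - ny * ax      # proportional to sin of the signed angle CRA
    cross_b = nx * by - ny * bx      # proportional to sin of the signed angle CRB
    dot_a = nx * ax + ny * ay        # proportional to cos; positive for A inside
    dot_b = nx * bx + ny * by        # proportional to cos; positive for B inside
    # tan(CRA) = -tan(CRB), multiplied through by the positive denominators
    return cross_a * dot_b + cross_b * dot_a


def reflection_points(A, B, n_grid=3600, iterations=60):
    """Return the parameters t in [0, 2*pi) of the points of reflection."""
    roots = []
    step = 2.0 * math.pi / n_grid
    for k in range(n_grid):
        lo, hi = k * step, (k + 1) * step
        f_lo = reflection_condition(lo, A, B)
        f_hi = reflection_condition(hi, A, B)
        if f_lo == 0.0:
            roots.append(lo)
            continue
        if f_lo * f_hi < 0.0:
            # plain bisection on the bracketing interval
            for _ in range(iterations):
                mid = 0.5 * (lo + hi)
                f_mid = reflection_condition(mid, A, B)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append(0.5 * (lo + hi))
    return roots


def angle_at(R, P, Q):
    """Unsigned angle PRQ in radians."""
    ux, uy = P[0] - R[0], P[1] - R[1]
    vx, vy = Q[0] - R[0], Q[1] - R[1]
    return math.atan2(abs(ux * vy - uy * vx), ux * vx + uy * vy)
```

Under these assumptions the residual is a trigonometric polynomial of degree two in t, so the search returns at most four points unless the residual vanishes identically; roots closer together than the grid spacing would be missed.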

The problem is non-trivial if A and B are not on the same diameter of the 
circle. The case AC = BC had already been solved by Ptolemy in his Optics 
(which was known to Ibn al-Haytham) by means of ruler and compass. Ibn 
al-Haytham discusses this easy case in V:70-72, and summarises the result 
in V:87 (without reference to Ptolemy). 

If AC ≠ BC the problem cannot be solved by means of ruler and 
compass. Ibn al-Haytham first solves a lemma by means of a hyperbola and a 
circle (V:32-34). This is used in the solution of a second lemma involving a 
right triangle and a transversal (V:35, 38). To return to the problem of 
Alhazen: assume AC > BC, and bisect the arc of the circle contained 
between the radii AC and BC; call the parts α (on the side of A) and β (on the 
side of B). Let γ be the arc opposite α. Ibn al-Haytham shows that R can only 
be on one of these three arcs. He discusses γ in V:73-75, and α, β in 
V:76-83. Constructions of points of reflection occur in V:73 (on γ) and 
V:82-83 (on β). These constructions are based on the lemmas V:35, 38. The 
results of V:73-83 are summarised in V:86. 

We now return to the book under review. The authors first note that the 
problem of Alhazen (for AC ≠ BC) is equivalent to an equation of degree 
four. However, Ibn al-Haytham never mentions this fact (and would not 
have gained anything by it), so that one may well argue that his solution does 
not really belong in an essay about geometrical solutions of equations. In an 
attempt to convey some of the techniques of Ibn al-Haytham, the authors 
then translate the proofs of the propositions V:32-36, 70-75, 87 and 37, 38 
(in their ‘appendix 2’) from the Latin text in the 1572 edition. The choice of 
propositions and the scheme on page 159 show that they have not 
understood the deductive structure of the solution, as outlined above. It is 
most revealing that they choose the easy case V:87 as their ‘goal’, and that 
they believe that V:87 is directly dependent on V:36 and V:73. V:36 is not 
used at all in Alhazen’s problem in the above-mentioned sense. Some 
samples of the translation, taken from V:71: ‘secundum modum praedic- 
tum’ is translated as ‘in the second noted manner’ (p. 178); ‘3 librum magnae 
constructionis Ptolemaei’ is translated as ‘the three big books of construc- 
tions of Ptolemy’ (p. 177; actually, Book III of Ptolemy’s ‘Great 
Composition’, i.e. the Almagest, is meant). Thus the authors display their 
innocence of Latin grammar (and of the history of astronomy). Fortunately, 
an excellent translation of V: 32-38 from the Arabic by Sabra has appeared 
in [1982], so that the translation of these propositions by the authors is now 
superfluous. 

The discussion of the sources of Ibn al-Haytham’s solution (pp. 172-5) 
should not be taken seriously. Ptolemy’s Optics is dismissed on the ground 
that it contains a ‘false assumption about the nature of reflections’ (p. 175); 
this is probably meant to be a reference to an incorrect principle for 
determining the position of the image of an object. The authors did not 
realise 1. that this principle is irrelevant for finding the point of reflection, 
and 2. that Ibn al-Haytham used the same incorrect principle. If they had 
read Ptolemy’s Optics, they would have realised that Ptolemy had already 
solved the case (V: 87) which they believe to be the ‘culmination’ (p. 171) of 
Ibn al-Haytham’s achievement. The discussion of possible relations 
between Ibn al-Haytham’s solution and the Cutting-off of a ratio of 
Apollonius (p. 174) shows that the authors have misunderstood T. L. 
Heath’s summary of this work which they quote (they have evidently not 
read the work itself). 

In this way one could go on for a long time enumerating errors and 
inadequacies. It seems that the authors are not even sure of the correct name 
of ‘Ibn al-Haytham’; in the majority of cases they give ‘al-Haytham’. The 
confusion is caused by the Latinised form Alhazen, which is derived from 
Al-Hasan, Ibn al-Haytham’s first name. 

This must suffice as an indication of the level of a substantial part of the 
book. There is really no excuse for errors such as have been mentioned. One 
can only hope that the future volume which the authors promise in the 
preface will be based on a serious study of all the sources that are involved. 
The present volume hardly deserves its title ‘Studies’ in the exact sciences in 
medieval Islam. 


JAN P. HOGENDIJK 
Institut für Geschichte der Naturwissenschaften, Frankfurt am Main 


The author of this book calls himself ‘an irresponsible dilettante, an ἰδιώτης, 
who follows Rabelais’s naughty counsel: fay ce que vouldras. I have sought a 
few pretty pebbles on the shore washed by the great ocean of beauty that 
mathematical science affords’ (p. viii). He regards himself as a mathema- 
tician and, in fact, his command of modern analysis is astonishing, which 
makes for rigour and depth but also restricts the audience of his technical 
publications to the upper crust. Luckily for us this book is not one of those. 

Actually Truesdell is not a mathematician but a physicist and historian. 
More precisely, he is the foremost authority on classical physics— 
particularly mechanics and thermodynamics—and its history. He not only 
masters mechanics and thermodynamics—not to speak of statistical mech- 
anics and electrodynamics—but he has also made important contributions 
to these disciplines, as well as to their history. In fact, Truesdell and his 
students, coworkers and compagnons de route have overhauled classical 
physics to a point where its founders might not recognize it: we can now 
speak of neoclassical physics. This has been the result of cultivating the field 
with the help of powerful mathematical tools and of exploring systematically 
a region that had been neglected for centuries, namely the laws of materials 
of special kinds, or constitutive ‘relations’ (actually equations or inequa- 
tions), the simplest and best known of which are Hooke’s (in elasticity) and 
Ohm’s (in electricity). 

As if this were not enough, Truesdell is the undisputed authority on 
Euler, and the man who succeeded in unravelling what he calls ‘the 
tragicomical history’ of thermodynamics. He is also the founder and current 
editor of the prestigious Archive for Rational Mechanics and Analysis and of 
the Archive for History of Exact Sciences. Not content with having done all 
this, Truesdell has now turned his basilisk’s glance to the philosophy of 
science and, in particular, of mechanics. What he saw here did not amuse 
him, as will be apparent in a moment. 

The book under review is a collection of 42 essays on the history and 
philosophy of the exact sciences, most of which have appeared in specialized 
journals. One of the two dealing with the philosophy of science is titled 
‘Suppesian stews’ (pp. 503-579), and is probably the most interesting to 
readers of this journal. This essay, previously unpublished, undertakes to 
demolish the entire line initiated in 1953 by the late logician J. C. C. 
McKinsey and his then students A. C. Sugar and Patrick Suppes, and 
continued by the ‘structuralist’ school of J. Sneed, W. Stegmüller, C. U. 
Moulines, and a few others, including R. Montague and E. W. Adams. 

Truesdell has no difficulty in showing that the mechanics and thermo- 
dynamics that the ‘structuralists’ claimed to have reconstructed are only 
caricatures of the real sciences of mechanics and thermodynamics. To begin 
with, they concentrate on point mechanics, a tiny fragment of mechanics, 
and on thermostatics, a small fraction of thermodynamics. This enables 
them to ignore all questions of constitutive ‘relations’ and irreversible 
processes. But of course this is no excuse for ignoring reference frames and 
units, which they do consistently. Moreover, they have ignored everything 
that happened in mechanics since Newton and in thermodynamics since 
Duhem. 

(I may add that the ‘structuralists’ have no use for semantic hypotheses, or 
‘correspondence rules’, and hold the strange belief that model theory—a 
branch of logic that takes care of the semantics of pure mathematics—can 
double as the semantics of empirical science. This may help explain their 
mistaking theoretical physics for mathematical physics, as is apparent from 
the title of Sneed’s book, The Logical Structure of Mathematical Physics.) 

Truesdell becomes particularly indignant when commenting on the 
passages where the ‘structuralists’ arrogantly declare themselves to be specialists in 
mechanics and thermodynamics, and to have brought to these fields the 
rigour of Bourbaki. He deflates these claims and notes mistake after mistake. 
He also includes excerpts from the correspondence he exchanged in 1952 
with McKinsey, which show the latter to be remarkably unwilling to listen 
to criticisms from genuine experts on the matter. Why then did Truesdell 
communicate and publish that fateful paper? The explanation is in the note 
that he appended to the paper published in 1953: 


The communicator [i.e. C.T.] is in complete disagreement with the view of classical 
mechanics expressed in this article. He agrees, however, that strict axiomatization of 
general mechanics—not merely the degenerate and conceptually insignificant special 
case of particle mechanics—is urgently required. While he does not believe the 
present work achieves any progress whatever toward the precision of the concept of 
force, which always has been and remains still the central conceptual problem, 
indeed the only one not essentially trivial, in the foundations of classical mechanics, 
he hopes that publication of this paper may arouse the interest of students of 
mechanics and logic alike, thus perhaps leading eventually to a proper solution of this 
outstanding but neglected problem. (p. 527) 


Little did Truesdell suspect that such tolerance on his part would help set 
up a new variety of scholasticism. He could not possibly anticipate the 
enthusiasm of so many philosophers over the ‘structuralist’ school, the 
productions of which are not only remote from science but also charac- 
terized by ‘pompousness, undisciplined rambling, verbosity, and frenetic 
vagueness’ (p. 570). 

So, there is no doubt that the axiomatic foundation of physical theories is 
an outstanding problem, and that a necessary condition for doing valuable 
work on it is familiarity with the theories concerned. Such familiarity has 
enabled Walter Noll, a former student of Truesdell’s, to make decisive 
contributions to the foundations of neoclassical continuum mechanics. 
He added several axioms, streamlined the mathematical formalism, and 
obtained several new theorems. Other authors, who usually publish in 
the Archive for Rational Mechanics and Analysis, have generalized Noll’s 
work, and still others have applied it to materials of special kinds. 

In this reviewer’s opinion the work done by this remarkable school over 
the past three decades must be supplemented with sets of explicit semantic 
assumptions (not operational ‘definitions’) pointing to the referents of the 
basic predicates. This is necessary to avoid mistaking physics for a branch of 
mathematics (hence in no fear of experiment), as well as to avoid non-realist 
(in particular subjectivist and conventionalist) interpretations. If only the 
mathematical formalism of a scientific theory is presented, such mis- 
interpretations are hardly avoided. They are seen to be mistaken the 
moment one adds that the formulas concerned are supposed to represent or 
model such and such material entities, and that such and such experiments 
confirm the theory in a certain range. 

There is much more in this book that should draw the attention of the 
philosopher of science. This reviewer’s own favourites are ‘Experience, 
theory and experiment’, ‘Statistical mechanics and continuum mechanics’, 
‘The role of mathematics in science as exemplified by the work of the 
Bernoullis and Euler’, ‘Conceptual analysis’, ‘Is there a philosophy of 
science?’, ‘The scholar: a species threatened by professions’, and ‘The 
computer: ruin of science and threat to mankind’. The historian of science 
has much to learn from this book, and he may be surprised to see some of his 
heroes demoted, while some heretofore obscure individuals are brought to 
prominence. And everyone will be now delighted, now enraged, by the 
opinions of the most heterodox of conservatives and the most traditionalist 
of innovators: Clifford Truesdell, the red tory of physics. 


MARIO BUNGE 
Foundations & Philosophy of Science Unit, McGill University, Montreal 


STRAWSON, P. F. [1985]: Scepticism and Naturalism: Some Varieties. 
Methuen. x+98 pp. £10.95. 


This book is taken from the Woodbridge Lectures, given in 1983 at 
Columbia. In it Strawson, painting with a broad but masterly brush, 
sketches a whole landscape of philosophical concerns—the external world, 
morality, mind-body connections and meaning. His linking thread is 
scepticism and the various possible strategies for answering or, better, 
sidestepping it. 

He considers first scepticism about the external world. Here he argues 
that a number of well known attempts to answer the sceptic directly 
(Descartes, Kant, Moore, Carnap, Quine) all fail in the task which they 
have, at least ostensibly, set themselves. Carnap cannot show that the 
sceptic’s question is senseless; from the sceptic’s point of view Moore will 
seem a dogmatist; and transcendental arguments can at best prove only that 
it must seem to us as if external objects existed. 

But, says Strawson, the correct response to all this is not to look for better 
proofs of the external world but rather to note that the sceptic’s reasons for 
doubting it, and equally his opponents’ arguments for belief in it, are quite 
idle. We have no option but to accept that material objects outside us exist. 
Hume (in one mood) and Wittgenstein (in On Certainty) are, at this point in 
Strawson’s discussion, illuminatingly compared as proponents of a ‘natura- 
lism’ which starts from recognition of this sort of epistemological fact about 
us. 

In the later chapters ‘naturalism’ of another sort appears on the scene, 
namely a reductive tendency to give ontological primacy to the ‘natural’ 
world i.e. roughly the spatio-temporal object of scientific investigation. 

Although Strawson does not discuss this explicitly, what the two sub-varieties 
of naturalism have in common seems to be commitment to starting 
from what is, or at least seems to be, given to us as the facts about how things 
are—as opposed, perhaps, to starting from some idea of how things must or 
ought to be. There is a clear link here with Strawson’s preference for 
descriptive as opposed to revisionary metaphysics. 

The two sorts of naturalism are labelled, respectively, soft (or humanistic) 
and hard (or scientific). As we shall see, they can clash. Attempts to follow 
the hard line seem to lead to a requirement for drastic revision of our 
conceptual scheme. But soft naturalism is going to resist these revisions—on 
the grounds, roughly, that we just cannot, in fact, accept them. Strawson’s 
sympathies are with this latter move; and the label ‘thoroughgoing’ which he 
at one point applies to soft naturalism hints at one of the reasons for this 
preference: soft naturalism is arguably a more consistent stance in that it 
takes account of facts about us as well as about the ‘natural’ world of science. 

But let us see in more detail how the story unfolds. 

Hard naturalism is an ally of further species of scepticism (or at least of a 
set of philosophical positions akin to scepticism) in that it undermines 
straightforward realist endorsement of claims about the existence of 
secondary qualities, moral properties, psychological states and abstract 
objects. None of these things are to be found in the ‘natural’ world of 
science. 

These ‘scepticisms’ cannot be dismissed as ‘idle’, in the way that external 
world scepticism can. Two viewpoints are, as Strawson admits, genuinely 
available here, the participant ‘realist’ one and the detached ‘reductive’ or 
sceptical one. But, he insists, we should not say that the detached viewpoint 
is the correct one, from which we see the world as it really is—the secondary 
qualities, moral properties seemingly visible to the participant being merely 
illusory. We should not say this because we cannot occupy that detached 
viewpoint all the time; we have no choice (again a natural or given fact about 
us) but to enter into a variety of human relations and activities and to see the 
world in terms of the moral, psychological and intensional categories, 
operation with which is definitive of these activities. Moreover, there is no 
third standpoint from which one could adjudicate between the claims of the 
other two. Rather we must say that both viewpoints are proper ones to 
occupy. And consequently we must ‘acquiesce in the appropriate relativi- 
sations of our conception of the realities of the case’ (p. 41). 

Three comments on all of this. 

First it is not clear that Strawson has said enough to see off the sceptic 
completely. Suppose the sceptic retreated to a more sophisticated position 
where he did not claim to doubt, say, the external world but merely 
professed to have intellectual qualms about finding himself so irrevocably 
committed to beliefs which, he discovers, he cannot justify? Taking this 
worry seriously (‘We ought to have justification for all our beliefs’) can set 
philosophers off on just the same (failed) lines of argument as the attempts to 
provide a direct answer to groundlevel scepticism. And Strawson’s emphasis 
on the unavoidability of commitment to the external world (or, 
mutatis mutandis, moral properties, meanings or whatever) does nothing to 
prevent this happening. But the Wittgenstein of On Certainty surely does 
have something to offer here, namely reflections on the mistaken notions of 
knowledge and justification which propensity to pose the new sceptical 
worry presupposes. I do not think that we can claim scepticism to be 
properly placed until these further considerations are also brought into the 
picture. 

Second, in speaking of ‘relativisation of our conception of the real to a 
viewpoint’ Strawson has sketched the shape of a possible solution rather 
than providing the detailed workings. What is a ‘viewpoint’? How can reality 
be relative? How does ‘relativising to a viewpoint’ differ from saying that the 
debated statements are (implicitly) relational? And if we said this would it 
not be to admit the hard reductivist’s case? The committed reductive 
naturalist will want these questions answered before he will be shifted. And 
he will also, I think, want an answer to the contention (mentioned but not 
explicitly answered by Strawson) that his account is better because more 
all-embracing than the opposed view; he may well claim that he can explain how 
the (pseudo) statements of the soft naturalist view come to be made. 

But thirdly and finally one must say that, however much one might want 
to argue further on this or that point, or to see more said in clarification, the 
book overall is splendidly mellow and judicious. It is full of illuminating 
juxtapositions and penetrating diagnoses. The only pity is that it is 
published rather expensively in hardback and so may well not reach as wide 
an audience as it deserves. 


JANE HEAL 
University of Newcastle Upon Tyne 


GREGORY, RICHARD [1984]: Mind in Science. Cambridge University Press. 
xiii+641 pp. 


In discussing the relation between perception and knowledge in Galileo’s 
Dialogue concerning the Two Chief World Systems, Richard Gregory remarks 
‘There is much here to ponder.’ The same can be said of Gregory’s Mind in 
Science itself. 

Mind in Science presents the reader with a dazzling array of topics. They 
range from catastrophe and mind in Babylonian myth to contemporary 
research at the Bell Telephone Laboratories. In physics Gregory covers 
topics as diverse as Lucretius’ views on space and modern Gauge theories. 
But it is in psychology that Gregory is at his pedagogic best, offering lucid 
and informed discussions of Helmholtz, Freud, Pavlov, S. S. Stevens, and 
Skinner. His expositions of some areas of psychology, e.g. the Weber- 
Fechner law, are as good as I have encountered anywhere. 

Gregory sets out in Mind in Science to use history to establish two 
philosophical theses. The first is that the use of tools and technologies 
liberated man from superstition and myth and helped establish science and 
speculative thinking. ‘One might guess also that the powers of metal tools 
showed that the ancient ways of the Gods could be challenged and improved 
by tools, planned technology and critical creative thinking. I suspect that 
these are the primary sources of philosophy and science.’ (p. 21). Further, it 
is via analogies with the concepts we employ to describe our technologies 
that man has come to understand himself. When generalized, these 
principles provide the conceptual bases for our descriptions of ourselves. 
Thus the importance of the mechanized chronometer for earlier philoso- 
phers, and the electronic digital computer in our generation. 

His second thesis, and the one of most interest to philosophers of science, 
is that there is a crucial interface between science and mind: the predictive 
hypothesis. Our theme, he states, is to ‘try to discover the relation of Science 
to Mind; to understand Mind in terms of the nature of Science; and perhaps 
to understand Science through understanding Mind, with hypothesis as the 
central linking notion.’ (p. 395). Perceptions, even consciousness, are 
considered as being predictive hypotheses, essentially similar to hypotheses 
in science. 

It is this second thesis with which I will be most concerned. Several points 
in the text, however, deserve special mention first. For reasons none too 
clear, Gregory jumps into the nature-nurture controversy by including an 
insightful and well-argued section on intelligence testing and the Binet- 
Simon Intelligence Quotient scale. The ‘paradox’ of IQ tests, he argues, 
arises when we attempt to reconcile the following three propositions: 


1. Intelligence is not supposed to be increased by education. 
2. Abilities are supposed to be increased by education. 
3. Intelligence is measured by abilities. 


The resolution of the paradox is to distinguish between ‘potential’ and 
‘kinetic’ intelligence. The former involves the stored information we already 
have from past problem-solving and experience. The latter, however, 
concerns the mind’s ability to fill in gaps and arrive at novel solutions when 
past experience proves inadequate for the task in hand. To measure 
intelligence we must determine the contribution of each and this, he argues, 
the Binet-Simon test fails to do. The paradox is resolved by realizing that 
knowledge is intelligence of an important kind; and knowledge can be 
increased by education. 

Also central to an objective measure of intelligence is determining the 
complexity of the tasks or problems. This is not simply difficult, as is 
simultaneously measuring kinetic and potential intelligence; it is 
impossible. In a passage typical of his anecdotal style, Gregory writes: 

... we may cite a famous mathematical joke, which is probably true, concerning the 
distinguished mathematician John von Neumann and the fly problem. There are 
two cyclists a mile apart, cycling towards each other and each going at 10 miles per 
hour. A fly flies from the nose of one cyclist to the nose of the other . . . until they 
meet. The fly flies at 15 miles per hour. How far has the fly flown [when they meet]? 
When he [von Neumann] was asked . . . the mathematician thought for several 
moments before coming up with an answer, . . . It turned out he had performed a 
remarkable feat, by not seeing the easy way. He computed the distance that the fly 
flew as the limit of a series (which is beyond almost anyone to do mentally) . . . (p. 
298) 


Was von Neumann being incredibly intelligent or was he being stupid in not 
seeing the obvious method? Should ‘complexity’ refer to the methods used, 
or to intrinsic features of the problem? The point of the anecdote is that, as 
far as intelligence is concerned, such a distinction is arbitrary and thus no 
objective measure of complexity can exist. 
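
The two routes to the answer in the anecdote can be made concrete. What follows is an illustrative sketch added for the reader, not anything taken from Gregory's text; the function names and the choice of Python are assumptions. The 'easy way' notes that the cyclists close the one-mile gap at 20 miles per hour, so the fly is airborne for 1/20 of an hour. The 'hard way' sums the fly's successive legs, each of which turns out to be one fifth the length of the one before.

```python
def fly_distance_easy(gap=1.0, cyclist_speed=10.0, fly_speed=15.0):
    """Easy way: time until the cyclists meet, times the fly's speed."""
    time_to_meet = gap / (2 * cyclist_speed)  # closing speed is twice one cyclist's speed
    return fly_speed * time_to_meet


def fly_distance_series(gap=1.0, cyclist_speed=10.0, fly_speed=15.0, legs=60):
    """Hard way: add up the fly's legs between the two noses, one at a time."""
    total = 0.0
    for _ in range(legs):
        # The fly and the oncoming cyclist approach each other at fly_speed + cyclist_speed.
        leg_time = gap / (fly_speed + cyclist_speed)
        total += fly_speed * leg_time
        # Meanwhile both cyclists have moved, so the gap has shrunk.
        gap -= 2 * cyclist_speed * leg_time
    return total


if __name__ == "__main__":
    print(fly_distance_easy())    # about 0.75 miles
    print(fly_distance_series())  # converges to the same value
```

Both functions agree on roughly three quarters of a mile, which is the reviewer's point: the problem has a single answer, but how 'complex' it is depends on the route one happens to take to it.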

A second point which deserves mention is Gregory’s attitude towards 
philosophy and philosophers. While admonishing philosophers for seeing 
conceptual and logical problems where he (in this case, quite literally) sees 
empirical ones, he nevertheless treats philosophers as indispensable co- 
workers in the endeavour to understand mind. In addition to being 
genuinely sympathetic to philosophy, Gregory reads philosophy with an eye 
for understanding what motivated the questions asked, and the answers 
offered. In this respect he is a refreshing break from much contemporary 
scholarship concerned with argument and refutation. Philosophers have 
asked too much in seeking certainty, he believes, and have often employed 
wrong methods in their quest for knowledge—but they are not the splendid 
irrelevancies which many psychologists have thought. 

Having said this, however, it should be remarked that few will find 
Gregory’s sketches of the major philosophers satisfying. They are too short 
and often focus artificially on only one aspect of the philosopher’s work (e.g. 
Descartes’ doubt). Of course physicists and psychologists may feel the same 
about their sections of the book; there being so few polymaths of Gregory’s 
stature, however, this is hardly a general drawback of the book. Nearly 
everyone should learn something from Mind in Science. 

Gregory maintains that the predictive hypothesis is crucial to both science 
and mind. We ‘see and learn only by positing and refuting hypotheses’, he 
states in the Pretext. Also central to science and mind is Baconian Induction 
(including the use of prior and subjective probabilities) and employment of 
crucial experiments. To see how these themes converge, consider his 
discussion of Classical Conditioning experiments. 


The normal requirement of many trials to establish the link is exactly what Francis 
Bacon advocated for useful scientific induction. The fact that animals can learn 
associations by a single trial strongly suggests, I believe, that the usual requirement 
of many trials for associative learning is due not to physiological limitations . . . but 
rather to the need for several trials to establish relevant predictive power. This is 
necessary because most stimuli that occur are not associated with particular events, 
and so have no predictive power. (p. 277) 


This shows that the physical stimulus is less important than the prediction 
that is built up by the organism. The stimulus-response pair generates a 
predictive hypothesis of the form ‘If S then R’. In usual associative learning 
many trials are needed to establish the predictive hypothesis. When the 
stimulus is particularly salient, however, the prediction may follow im- 
mediately. Indeed, in some cases when the stimuli are regularly spaced in 
time, the response appears before the stimulus; i.e., when S is anticipated, R 
follows automatically. Gregory’s conclusion is that ‘conditioning provides 
the first layers of generalizations from events—essentially by Induction— 
and that these data are used with more sophisticated organization (and 
meaning) to give cognitive behavior and understanding. This is again very 
much what happened in the history of science.’ (p. 283). 

Learning is not the only cognitive process where the predictive hypothesis 
is important: perception or seeing also requires prediction and inference. 
Gregory’s scientific and philosophical mentor here is, of course, Hermann 
von Helmholtz, who first postulated unconscious inferences in perception. 
But Gregory’s view far surpasses Helmholtz’, for what merely involved inferences 
for the latter are themselves inferences for the former. The crucial point for 
Gregory is that animals and humans predict from limited sensed data what is 
actually being seen or perceived. Perceptions are hypotheses framed in part 
from the sensed content of our experiences, but also from our past learning 
and expectations: thus the importance of prior and subjective probabilities. 
‘In fact we have every reason to believe,’ he writes, ‘that perceptions have 
their richness and integrity as well as their predictive power through 
inference.’ (p. 388). 

Gregory illustrates this point through detailed examination of visual 
illusions and ambiguities such as Jastrow’s duck-rabbit and Boring’s wife- 
mistress. What is seen in the illusion depends not only on what is given 
‘upwards’ in the stimulus patterns themselves, but also on ‘top-down’ 
processing which the subject provides according to his stored data, past 
experiences, what is anticipated, and so on. When the stimulus patterns are 
incomplete, our perception interpolates the data to ‘fill in’ the necessary gaps 
and uses the completed perception to extrapolate to future states. 

The analogy with the history of science and Baconian Induction is thus 
complete. The differences between Gregory and Popper hardly require 
mention except, perhaps, to say that what gives added plausibility to 
Gregory’s account of mind and science is that it parallels accounts of 
biological evolution. Organisms with strong inductive mechanisms would 
have an obvious advantage over ones which didn’t. Indeed, the ability to 
recognize a predator, say, from as little data as possible would be highly 
selected for. A refuted hypothesis may advance science, but reliable and 
efficient inductions are more likely to advance one’s survival. 

If I have general criticisms of Mind in Science, they are these. While there 
is undoubtedly much truth in Gregory’s predictive hypothesis model, 
perception and consciousness are notoriously multilevelled phenomena, and 
explanations which ignore this or restrict their domain of discourse too 
narrowly perforce leave questions unanswered. There are at least three 
levels, for instance, at which a machine carrying out an information-processing 
task must be understood: the computational theory, the representation and 
algorithm, and the hardware implementation. Perceptual illusion or reversal, 
e.g. the Necker cube, may be understood on one level (the representational) 
as involving two interpretations or hypotheses. But on another 
level (the hardware implementation) it must be understood in terms of 
bistable neural networks (i.e. ones with two distinct stable states). While 
Gregory realizes this and comments on it extensively, he fails, I think, to 
appreciate the importance of different types of explanations at different 
levels of analysis. 

Finally the text itself is often unaccountably digressive and unevenly 
written. In a work of this scope and erudition, this is perhaps unavoidable; 
however, more felicitous editing would have improved much of the book. 
These minor objections notwithstanding, those with serious interest in the 
history and philosophy of science should read Mind in Science. 


JAMES ANDERSON 
University of Pennsylvania 


TILES, MARY [1984]: Bachelard: Science and Objectivity. Cambridge 
University Press. xxii+242 pp. (ISBN 0-521-24803-5 hard covers; 
0-521-28973-4 paperback). 


It is no secret that a widespread concern in contemporary philosophy of 
science is how to comprehend the conceptual change imposed by the 
evolution of scientific theory. Three approaches to this well known problem 
can be briefly distinguished. First, there is the Cartesian strategy, which, in 
virtue of its foundationalism, is unable to accommodate conceptual change. 
Second, there is the relativism associated, for instance, with Kuhn and 
Feyerabend, whose more extreme form is evident in Rorty’s epistemological 
despair. Third, there is the view of conceptual change as an integral part of 
the scientific process, for example in the thought of Gaston Bachelard, the 
French philosopher. 

Although Bachelard is widely influential in French thought, he is scarcely 
familiar in the English language discussion. This excellent introduction 
to his philosophy of science, the first such critical study in English, 
is specifically intended for an analytic audience. The result is a clear, 
intelligent examination, accessible to the non-specialist, which constantly 
compares and contrasts Bachelard’s views with those of analytic philosophers 
of science understood in a broad sense, including Wittgenstein and 
especially Frege. The discussion is useful in setting out the main lines of 
Bachelard’s anti-Cartesian, anti-Euclidean, anti-Baconian approach. Dr. 
Tiles’ main point is that ultimately the perspectives of Bachelard and 
analytic philosophy differ not only about philosophy of science, but even on 
how to approach science. 

Dr. Tiles begins by noting Duhem’s influence, in different ways, on both 
analytic philosophy of science and Bachelard. According to the latter, the 
Cartesian view is not relevant to contemporary science, which continues to 
progress towards objective knowledge in the absence of foundations. She 
particularly stresses that epistemological rupture (coupure épistémologique) is 
a normal part of the scientific process in order to escape from relativism. Dr. 
Tiles elaborates this point in the discussion of non-Euclidean mathematics 
and the rationality of science. She elucidates the role of mathematics in 
providing the framework for scientific thought and discusses the cardinal 
difference between Bachelard’s anti-Cartesian view and Fregean logical 
foundationalism. In virtue of the Fregean separation of justification and 
discovery, the latter is unable to model conceptual change. 

In an account of non-Baconian science and conceptual change, Dr. Tiles 
examines Bachelard’s understanding of the transition from common sense 
to science, for instance through the abandonment of the immediately given 
and through the use of deduction in inductive science. She provides a good 
comparison of the treatment of conceptual change in Frege and Bachelard, 
followed by description and analysis of the latter’s view of the historical 
development of the concept of mass. She also brings out well Bachelard’s 
belief that a turning point in any science occurs when rationally organized 
theory supplants the more empirical study from which it arises. 

The most interesting phase of the discussion concerns the epistemology of 
(scientific) revolution in relation to realism and instrumentalism. Here Dr. 
Tiles notes the utility of Bachelard’s stress on science as a dynamic process 
in understanding the alteration of our view of reality and its study in this 
century as compatible with cognitive progress. Through the incorporation 
of major cognitive change into the nature of science, from his anti-Cartesian 
perspective he offers an alternative to the views of Feyerabend, Kuhn 
and others. The author illustrates this thesis in terms of causality and 
the application of mathematics. She further shows the importance of 
Bachelard’s insistence on the role of the rational agent and its relation to 
Kant’s view of the subjective structuring of the world of experience. 

In conclusion, this is a fine introduction to Bachelard’s philosophy of 
science. My only reservation is minor, that is, that Dr. Tiles is too hesitant in 
failing to draw the conclusion which follows directly from her analysis. Her 
discussion shows that Bachelard’s approach differs most significantly from 
analytic philosophy of science in general in its intent and capacity to 
integrate conceptual change within the process of scientific development. 
But for this important point, she unfortunately substitutes a correct, 
but weaker statement concerning the incommensurability of different 
theoretical perspectives. 


TOM ROCKMORE 
Fordham University 


ANNOUNCEMENT 


LAKATOS AWARD 


Philosophy of Science 


The closing date for nominations for the second annual 
Lakatos Award is 15 April, 1987. The value of the Award 
will be £10,000. The Award will be for an outstanding 
contribution to the philosophy of science in the form of a 
book published in English during the last ten years (that is, in 
1977 or later). Candidates must be nominated by at least three 
people of recognised professional standing. Nominators 
should give their grounds for the nomination and indicate the 
candidate’s age, since a preference may be given to younger 
scholars. It will be appreciated if three copies of the book are 
provided. Nominations should be marked ‘Lakatos Award’ 
and addressed to: The Secretary, The London School of 
Economics and Political Science, Houghton Street, London 
WC2A 2AE. 


The Award is endowed by the Latsis Foundation and 
administered, on behalf of The London School of Economics, 
by a Committee consisting of the Director of the School, or 
his deputy, as chairman, and Professors Hans Albert, Adolf 
Grünbaum, Alan Musgrave and John Watkins. The 
Committee will make the Award on the advice of an 
independent panel of selectors. 


The recipient will be expected to visit the School, and there 
deliver a public lecture of interest to a general audience. 





ANNOUNCEMENT 


Newton’s Philosophical 
and 
Scientific Legacy 


The Department of Philosophy of the Faculty of 
Science of the University of Nijmegen (The 
Netherlands) is pleased to announce an international 
congress, ‘NEWTON’S PHILOSOPHICAL AND 
SCIENTIFIC LEGACY’, to celebrate the 
tercentenary of the publication of Newton’s Principia. 
The congress will be held from June 9-12, 1987. 
Invited speakers will be (among others): 


G. CHRISTIANSON (Indiana State), 

I. BERNARD COHEN (Harvard), 

B. J. T. DOBBS (Northwestern), and 

RICHARD H. POPKIN (Washington University, St. 
Louis, Missouri) 


Contributed papers are invited. 
For more information, please write to: 


Department of Philosophy, 
Faculty of Science, 
University of Nijmegen, 
Toernooiveld, 
Nijmegen, 
The Netherlands 





8th International Congress of 


LOGIC, METHODOLOGY & 
PHILOSOPHY OF SCIENCE 


MOSCOW, U.S.S.R., 17-22 AUGUST, 1987 


Sections: 


1. Foundation of Mathematical Reasoning (including 
proof theory, category theory and the general 
philosophy of mathematics) 
2. Model Theory 
3. Foundation of Computing and Recursion Theory 
(including the foundation of computer science and its 
relation to logic) 
4. Set Theory 
5. General Logic (philosophical logic, various forms of 
non-classical logic with applications to philosophy and 
science) 
6. General Methodology of Science 
7. Foundation of Probability and Statistical Inference 
8. Foundation of Physical Sciences 
9. Foundation of Biological Sciences 
10. Foundation of Psychology and Cognitive Science 
11. Foundation of Social Sciences 
12. Foundation of Linguistics 
13. History of Logic, Methodology and Philosophy of 
Science. 


All communications concerning the Congress to Prof. I. T. 
Frolow or Dr. A. L. Blinov, Soviet Organising Committee, 
Institute of Philosophy, Volkhonka 14, 119842 Moscow, USSR. 
Telephone in Moscow: 2037165. 





BRITISH SOCIETY FOR THE PHILOSOPHY OF SCIENCE 
and 


DEPT. OF HISTORY AND PHILOSOPHY OF SCIENCE, 
KING’S COLLEGE, LONDON 


Joint meeting to celebrate the 300th Anniversary of the 
publication of 


DESCARTES 


Discours de la Méthode 


Speakers will include: 
DESMOND M CLARKE, University College, Cork 
JR MILTON, Imperial College, London 
MARY TILES, Royal Institute of Philosophy 
JEAN ROBERT ARMOGATHE, University of Paris 


on 
WEDNESDAY, MARCH 25, 1987—2.15-5.15pm 


at 
KING’S COLLEGE, STRAND, LONDON 


— ALL WELCOME — 


Further details may be obtained from: 


DR. GEORGE ROSS 

DEPT OF PHYSICS 

KING’S COLLEGE, STRAND, LONDON 
(01) 836 5454 Ex. 2320 


PLEASE NOTE CHANGE OF DATE FROM B.S.P.S. PROGRAMME 



















